- Type System Revision
- Peephole of
if RegionandPhi- Discussion
- Expanding Peephole Beyond Constant Expressions
- Dominators
You can also read this chapter in a linear Git revision history on the linear branch and compare it to the previous chapter.
In this chapter we do not add new language features. Our goal is to add peephole optimization to if statements.
We now need to extend the type system to introduce another lattice structure, in addition to the Integer sub-lattice. As before we denote "Top" by T and "Bottom" by ⊥.
- "Ctrl" represents a live control, whereas "~Ctrl" represents a dead control.
- "INT" refers to an integer value, note that integer values have their own sub-lattice.
| ⊥ | Ctrl | ~Ctrl | T | INT | |
|---|---|---|---|---|---|
| T | ⊥ | Ctrl | ~Ctrl | T | INT |
| Ctrl | ⊥ | Ctrl | Ctrl | Ctrl | ⊥ |
| ~Ctrl | ⊥ | Ctrl | ~Ctrl | ~Ctrl | ⊥ |
| ⊥ | ⊥ | ⊥ | ⊥ | ⊥ | ⊥ |
- "Top" meets any type is that type,
- "Bottom" meets any type is "Bottom",
- "Top", "Bottom", "Ctrl", "~Ctrl" are classed as simple types,
- "Ctrl" meets "~Ctrl" is "Ctrl"
- Unless covered above, a simple type meets non-simple results in "Bottom"
The entire revised Lattice is shown below:
- Our general strategy is to continue parsing the entire
ifstatement even if we know that one branch is dead. - This approach creates dead nodes which get cleaned up by dead code elimination. The benefits are that we do not introduce special logic during parsing, the normal graph building machinery takes care of cleaning up dead nodes.
- When we peephole an
Ifnode we test if the predicate is a constant. If so, then one branch or the other dead, depending on if the constant is0or not. - When we add the
Projnodes, in this scenario, we already know one of the projections is dead. The peephole of the relevantProjnode replaces theProjwith aConstantof type "~Ctrl" indicating dead control. - At this point our dead code elimination would kill the
Ifnode since it has no uses- but we cannot do that because we need to continue the parse. The
important observation is that we can kill the
Ifnode after both theProjnodes have been created, because any subsequent control flow can only see theProjnodes.
- but we cannot do that because we need to continue the parse. The
important observation is that we can kill the
- To ensure this, we add a dummy user to
Ifbefore creating the firstProjand remove it immediately after, allowing the secondProjto trigger a dead code elimination of theIf. - The live
Projgets replaced by the parent ofIf, whereas the deadProjgets replaced by "~Ctrl". - The parsing then continues as normal.
- The other changes are when we reach the merge point with a dead control.
- Again keeping with our strategy we merge as normal, including creating
Phinodes. - The
Regionmay have have one of its inputs as "~Ctrl". This will be seen by thePhiwhich then optimizes itself. If thePhihas just one live input, thePhipeephole replaces itself with the remaining input. - Finally, at the end, when the
Regionnode is peepholed, we see that it has only one live input and no Phi uses and can be replaced with its one live input.
One of the invariants we maintain is that the for each control input to a
Region every Phi has a 1-to-1 relationship with a data input. Thus, if a
Region loses a control input, every Phi's corresponding data input must be
deleted. Conversely, we cannot collapse a Region until it has no dependent
Phis.
When processing If we do not remove control inputs to a Region, instead the
dead control input is simply set to Constant(~Ctrl). The peephole logic in
a Phi notices this and replaces itself with the live input.
With the changes above, our peephole optimization is able to fold dead code when
an if statement has a condition that evaluates to true or false at compile time.
Let's look at a simple example below:
if( true ) return 2;
return 1;Before peephole:
After peephole:
Another example:
int a=1;
if( true )
a=2;
else
a=3;
return a;Before peephole:
After peephole:
While our peephole of the CFG collapses dead code when the expression is
a compile time constant. it does not do as well as it could with following example:
int a = 0;
int b = 1;
if( arg ) {
a = 2;
if( arg ) { b = 2; }
else b = 3;
}
return a+b;In this example, arg is defined externally, and we have to assume it is not a constant.
However, the code repeats the if( arg ) condition in the inner if statement.
We know that if the outer if( arg ) is true, then the inner if must also evaluate to true.
Our peephole at present cannot see this:
What we would like is for our peephole to see that the inner if is fully determined by the
outer if and thus, the code can be simplified as follows:
int a = 0;
int b = 1;
if ( arg ) {
a = 2;
b = 2;
}
return a+b;That is, the final outcome is either 4 if arg happens to be true, or 1 if arg happens to be false.
The key to implementing this peephole is the observation that the inner if is completely determined by the
outer if. In compiler parlance, we say that the outer if dominates the inner if.
There is a generic way to determine this type of domination; it involves constructing a dominator
tree. For details of how to do this, please refer to the paper A Simple, Fast Dominance Algorithm.1
However, constructing a dominator tree is an expensive operation; one that we would not like to do in the middle of parsing. Fortunately, since we do not have loops yet, we do not have back edges in our control flow graph. Our control flow graph is essentially hierarchical, tree like. Thus, a simple form of dominator identification is possible.
The main idea is that we can assign a depth to each control node, this represents the node's depth
in the hierarchy. We can compute this depth incrementally as we need to. The nodes that are deeper in the hierarchy
have a greater depth than the ones above.
Secondly we know that for our control nodes, except for Regions, there is only one incoming control edge,
which means that most control nodes have a single immediate dominating control node. Region nodes are different
as they can have one or more control edges; but, we know that for if statements, a Region node has exactly two
control inputs, one from each branch of the if. To find the immediate dominator of a Region node in the
limited scenario of an if, we can follow the immediate dominator of each branch until we find the common ancestor
node that dominates both the branches.
If this ancestor node is also an if and its predicate is the same as the current if, then we know that
the outcome of the present if is fully determined.
The code that does this check is in the compute method of the if.
The idom method in Node provides a default implementation of the depth based immediate dominator calculation.
Region's override this to implement the more complex search described above.
It turns out that to simplify the example above, we need a further peephole in our Add node.
With all these changes, the outcome of the peephole is just what we would like to see!
Footnotes
-
Cooper, K. D. Cooper, Harvey, T. J., and Kennedy, K. (2001). A Simple, Fast Dominance Algorithm. ↩