Assignment 4: Compiler passes
The compiler passes are described in chapter 5 of the textbook
and in the lectures,
but here they are again for completeness.
(You should still read the book and lectures for a more in-depth explanation!)
We will only include passes that you have to implement.
As before, the only files you should modify are the .ml files
corresponding to these compiler passes.
A. Passes that are unchanged from the previous assignment
These passes are identical except possibly for the names of imported
modules (e.g. Cif changing to Cloop).
- remove unused blocks (but see note below)
- build interference graph
- graph coloring
- allocate registers
- remove jumps (but see note below)
- patch instructions
- add prelude and conclusion
Note
The "remove unused blocks" and the "remove jumps" passes may or may not require modifications, depending on how you implemented them in the previous assignment. If you took advantage of the topological ordering of blocks in that assignment, you'll have to change that here. If not, you won't.
B. Passes that have minimal modifications from the previous assignment
These passes have to be extended to handle the new forms, but the extensions should be straightforward.
-
shrink
-
uniquify
Note
In the "uniquify" pass,
note that set! requires the variable being set
to have already been renamed.
Otherwise, signal an error.
C. Passes that are new
Uncover get! (uncover_get.ml)
This pass is discussed in lecture 13 and in the textbook.
The idea is to annotate any variable expression that corresponds
to a variable which is reassigned by set!.
Such variables are converted to get! expressions,
so Var x becomes GetBang x.
This is used in the "remove complex operands" pass,
as described in lecture 13.
Implementing this pass is quite simple. You go through the code twice.
The first time, you collect a set of all the variables that are assigned to
in a set! expression. The second time,
you convert any variable reference to one of these variables
to a get! (GetBang) form.
Since variable names are unique (thanks to the "uniquify" pass),
any reference to a variable which is assigned to in a set! expression
must refer to the same variable as any reference to that variable.
D. Passes with significant modifications from the previous assignment
1. Remove complex operands (remove_complex.ml)
In this pass, the set!, begin, and while forms
are all considered to be complex operands.
In addition, the subexpressions of set!, begin, and while
are allowed to be complex.
Also, Void has to be added as an additional case.
The get! (GetBang) form added in the "uncover get" pass
is converted back to a regular variable (Var) in this pass.
In rco_atom, the variable is bound to a new fresh variable name
which becomes the atom returned from the function.
In rco_exp, the variable is simply the returned expression.
It's legal to pass set! and while expressions to the
rco_atom function, since they return Void values.
These would not normally be function arguments,
but we handle this case for completeness.
In such cases, rco_atom returns the Void atom,
and binds the expression to the dummy variable $_,
which will never be accessed.
2. Explicate control (explicate_control.ml)
The following changes need to be made to the "explicate control" pass.
-
New forms need to be handled in
convert_atomandconvert_exp. This should be straightforward. -
The new
while,begin, andset!forms need to be handled inexplicate_assign. Both thewhileandbeginforms have side effects; these are handled in the newexplicate_effectfunction described below, and that function is called fromexplicate_assign. The return value ofbeginis the return value of the last expression; all others are evaluated for side effects. The return values ofwhileandset!areVoid. Make sure to accumulate the subexpressions ofbeginin the correct order. (Hint: theList.fold_rightfunction will be useful.) -
The
explicate_predfunction has to be extended to deal with thewhile,begin, andset!forms. Sincewhileandset!can only returnVoid, not booleans, handling them is easy: it's an error. Abegincan return a boolean, but the subexpressions need to be handled first, and in the correct order. -
The
explicate_tailfunction has to be extended to deal with thewhile,begin, andset!forms. This will involve calls toexplicate_assignand/orexplicate_effectas needed. -
The new function
explicate_effecthas been added to deal with forms that have side effects. Note that any form can be in a side-effecting position (e.g.inside abeginexpression), even ones that have no side effects. These pure expressions should be discarded because they have no effect (they are "dead code"). Effectful primitives (readandprint) can be converted using thePrimSconstructor. Converting the rest of the forms should be straightforward, with the exception of theifandwhileforms.
The form (if <test> <then> <else>), when evaluated for its effects,
is going to either evaluate the <then> clause or the <else>
clause for its effects. That means that those clauses will have to
be processed by explicate_effect as well. When doing this,
be careful not to duplicate the original tail passed to the
explicate_effect function; to avoid that, make it into a block
and use a Goto to that block.
The form (while <test> <body>) followed by a tail tail
should be converted to the equivalent of this code:
```
loop:
if test then
body
goto loop
else
tail
```
You will need to generate the loop label (marked as loop here).
Use the fresh function as you did in the uniquify pass
with the base loop and the separator _
so you get loop_N for some number N.
The body of the loop is evaluated only for its effects,
so it has to be processed by explicate_effect.
The if equivalent form has to be processed by explicate_pred,
returning a tail which constitutes a basic block.
The loop label will correspond to this block;
this has to be added to the basic_blocks variable
and a Goto to this block's label should be returned.
(Don't call create_block here, since explicate_pred
will do this if necessary.)
See section 5.6 in the book for further discussion of this.
3. Select instructions (select_instructions.ml)
The following changes need to be made to the "select instructions" pass.
-
A
Voidexpression becomes the integer0. -
A
Readcan be a stand-alone statement, as well as an expression. Stand-alone statements use the newPrimSconstructor of the "Cloop" intermediate language. -
The
Printfunction we added can also be a stand-alone statement. (Note thatPrintis our addition; it's not in the textbook.) Its return value (Void) can also be assigned to a variable, and it can be a tail expression.Printtakes one argument, which must be placed into the%rdiregister before calling the function.
Note that just as read in the source language becomes read_int
in assembly language, print in the source language
becomes print_int in assembly language.
4. Uncover live (uncover_live.ml)
The only parts of the "uncover live" pass that have changed are:
-
The
order_labelsfunction has been replaced with thecompute_livenessfunction, which uses the dataflow analysis described in sections 5.2 and 5.8 of the textbook, and also in lecture 14. This function iteratively computes the live-before sets of all blocks until it reaches a fixpoint. This is necessary because the control flow graph is no longer a directed acyclic graph in the presence of looping constructs, so we can't use a topological sort as we did in the previous compiler. -
The
uncover_livefunction has been modified to use thecompute_livenessfunction. Note that this function is supplied to you in its entirety (which wasn't the case in the previous assignment's compiler), because the steps are straightforward.
Therefore, your job for this pass
is to implement the compute_liveness function.
The algorithm is described in some detail
in the comment preceding the function.
This algorithm is not extremely difficult,
but we recommend that you use the _debug variable
and print out various important data structures
if debugging is enabled.
For one thing, you should check that the algorithm
really does converge to a final state.
The _debug variable (which is a ref, but doesn't really need to be)
is for your benefit. If you want to put in debugging code that can be
switched off, you can write something like this:
If the _debug variable is set to true,
the debug messages will be printed.
If it's set to false, they won't be.
If you change it, you have to recompile your code for it to take effect.