Assignment 4: Compiler passes
The compiler passes are described in chapter 5 of the textbook
and in the lectures,
but here they are again for completeness.
(You should still read the book and lectures for a more in-depth explanation!)
We will only include passes that you have to implement.
As before, the only files you should modify are the .ml
files
corresponding to these compiler passes.
A. Passes that are unchanged from the previous assignment
These passes are identical except possibly for the names of imported
modules (e.g.
Cif
changing to Cloop
).
- remove unused blocks
- build interference graph
- graph coloring
- allocate registers
- remove jumps (but see note below)
- patch instructions
- add prelude and conclusion
Note
The "remove jumps" pass may or may not require modifications, depending on how you implemented it in the previous assignment. If you took advantage of the topological ordering of blocks in that assignment, you'll have to change that here. If not, you won't.
B. Passes that have minimal modifications from the previous assignment
These passes have to be extended to handle the new forms, but the extensions should be straightforward.
-
shrink
-
uniquify
Note
In the "uniquify" pass,
note that set!
requires the variable being set
to have already been renamed.
Otherwise, signal an error.
C. Passes that are new
Uncover get!
(uncover_get.ml
)
This pass is discussed in lecture 13 and in the textbook.
The idea is to annotate any variable expression that corresponds
to a variable which is reassigned by set!
.
Such variables are converted to get!
expressions,
so Var x
becomes GetBang x
.
This is used in the "remove complex operands" pass,
as described in lecture 13.
Implementing this pass is quite simple. You go through the code twice.
The first time, you collect a set of all the variables that are assigned to
in a set!
expression. The second time,
you convert any variable reference to one of these variables
to a get!
(GetBang
) form.
Since variable names are unique (thanks to the "uniquify" pass),
any reference to a variable which is assigned to in a set!
expression
must refer to the same variable as any reference to that variable.
D. Passes with significant modifications from the previous assignment
1. Remove complex operands (remove_complex.ml
)
In this pass, the set!
, begin
, and while
forms
are all considered to be complex operands.
In addition, the subexpressions of set!
, begin
, and while
are allowed to be complex.
Also, Void
has to be added as an additional case.
The get!
(GetBang
) form added in the "uncover get" pass
is converted back to a regular variable (Var
) in this pass.
In rco_atom
, the variable is bound to a new fresh variable name
which becomes the atom returned from the function.
In rco_exp
, the variable is simply the returned expression.
It's legal to pass set!
and while
expressions to the
rco_atom
function, since they return Void
values.
These would not normally be function arguments,
but we handle this case for completeness.
In such cases, rco_atom
returns the Void
atom,
and binds the expression to the dummy variable $_
,
which will never be accessed.
2. Explicate control (explicate_control.ml
)
The following changes need to be made to the "explicate control" pass.
-
New forms need to be handled in
convert_atom
andconvert_exp
. This should be straightforward. -
The new
while
,begin
, andset!
forms need to be handled inexplicate_assign
. Both thewhile
andbegin
forms have side effects; these are handled in the newexplicate_effect
function described below, and that function is called fromexplicate_assign
. The return value ofbegin
is the return value of the last expression; all others are evaluated for side effects. The return values ofwhile
andset!
areVoid
. Make sure to accumulate the subexpressions ofbegin
in the correct order. (Hint: theList.fold_right
function will be useful.) -
The
explicate_pred
function has to be extended to deal with thewhile
,begin
, andset!
forms. Sincewhile
andset!
can only returnVoid
, not booleans, handling them is easy: it's an error. Abegin
can return a boolean, but the subexpressions need to be handled first, and in the correct order. -
The
explicate_tail
function has to be extended to deal with thewhile
,begin
, andset!
forms. This will involve calls toexplicate_assign
and/orexplicate_effect
as needed. -
The new function
explicate_effect
has been added to deal with forms that have side effects. Note that any form can be in a side-effecting position (e.g.
inside abegin
expression), even ones that have no side effects. These pure expressions should be discarded because they have no effect. Effectful primitives (read
andprint
) can be converted using thePrimS
constructor. Converting the rest of the forms should be straightforward, with the exception of theif
andwhile
forms.
The form (if <test> <then> <else>)
, when evaluated for its effects,
is going to either evaluate the <then>
clause or the <else>
clause for its effects. That means that those clauses will have to
be processed by explicate_effect
as well. When doing this,
be careful not to duplicate the original tail passed to the
explicate_effect
function; to avoid that, make it into a block
and use a Goto
to that block.
The form (while <test> <body>)
followed by a tail tail
should be converted to the equivalent of this code:
```
loop:
if test then
body
goto loop
else
tail
```
You will need to generate the loop label (marked as loop
here).
Use the fresh
function as you did in the uniquify
pass
with the base loop
and the separator _
so you get loop_N
for some number N
.
The body of the loop is evaluated only for its effects,
so it has to be processed by explicate_effect
.
The if
equivalent form has to be processed by explicate_pred
,
returning a tail which constitutes a basic block.
The loop label will correspond to this block;
this has to be added to the basic_blocks
variable
and a Goto
to this block's label should be returned.
(Don't call create_block
here, since explicate_pred
will do this if necessary.)
See section 5.6 in the book for further discussion of this.
3. Select instructions (select_instructions.ml
)
The following changes need to be made to the "select instructions" pass.
-
A
Void
expression becomes the integer0
. -
A
Read
can be a stand-alone statement, as well as an expression. Stand-alone statements use the newPrimS
constructor of the "Cloop" intermediate language. -
The
Print
function we added can also be a stand-alone statement. (Note thatPrint
is our addition; it's not in the textbook.) Its return value (Void
) can also be assigned to a variable, and it can be a tail expression.Print
takes one argument, which must be placed into the%rdi
register before calling the function.
Note that just as read
in the source language becomes read_int
in assembly language, print
in the source language
becomes print_int
in assembly language.
4. Uncover live (uncover_live.ml
)
The only parts of the "uncover live" pass that have changed are:
-
The
order_labels
function has been replaced with thecompute_liveness
function, which uses the dataflow analysis described in sections 5.2 and 5.8 of the textbook, and also in lecture 14. This function iteratively computes the live-before sets of all blocks until it reaches a fixpoint. This is necessary because the control flow graph is no longer a directed acyclic graph in the presence of looping constructs, so we can't use a topological sort as we did in the previous compiler. -
The
uncover_live
function has been modified to use thecompute_liveness
function. Note that this function is supplied to you in its entirety (which wasn't the case in the previous assignment's compiler), because the steps are straightforward.
Therefore, your job for this pass
is to implement the compute_liveness
function.
The algorithm is described in some detail
in the comment preceding the function.
This algorithm is not extremely difficult,
but we recommend that you use the _debug
variable
and print out various important data structures
if debugging is enabled.
For one thing, you should check that the algorithm
really does converge to a final state.
The _debug
variable (which is a ref
, but doesn't really need to be)
is for your benefit. If you want to put in debugging code that can be
switched off, you can write something like this:
If the _debug
variable is set to true
,
the debug messages will be printed.
If it's set to false
, they won't be.
If you change it, you have to recompile your code for it to take effect.