You have now written a complete compiler for eMiniJava and this is the final programming assignment of the course! In this assignment, you will improve your compiler by optimizing the generated code. There are plenty of ways to optimize a compiler. In this assignment you will optimize your compiler using either a forward or a backward data flow analysis approach. There are two main tasks in this assignment: first to convert your annotated ASTs to Control Flow Graphs (CFGs). The CFGs will then be used in the second task to perform the actual analysis. If you have any questions about this assignment please do not hesitate to ask the instructor.
Please note that optimization should not break the correctness of the generated code. In case that your compiler generates wrong programs because of optimization, you will significantly lose points in this assignment.
Write a program to generate a control-flow-graph for each method in the input file (i.e. all methods of all classes), except for the main
method which we won't bother analyzing.
The input to your program is the type annotated AST after the semantic analysis phase of your compiler. The output of your program is a set of graphs that you will use the dot
program to visualize them.
Note that during the generation of the control flow graph you will need to introduce new fresh variables when flattening an expression.
In general, it is not necessary to attach symbols (and type information) to the new fresh variables in your control-flow graph. We assume that type checking already eliminated all type-incorrect programs. The analysis of the optimizer also does not need the type information for the new fresh variables. However, for the variables that are used from the program, you may need to attach symbols from their AST nodes in order to trace them back in the original program.
Your compiler should generate one .dot
file per method in the input file (all methods of all classes, that is), except for the main
method which we won't bother analyzing. Here is for instance the graph generated for the method ComputeFac
of the Factorial.java
example:
//Factorial class Factorial{ public static void main(String[] a){ System.out.println(new Fac().ComputeFac(10)); } } class Fac { public int ComputeFac(int num){ int num_aux ; if (num < 1) num_aux = 1 ; else num_aux = num * (this.ComputeFac(num-1)) ; return num_aux ; } } | ![]() |
You can view .dot
files using dotty
from the GraphViz suite, or by converting them to .ps
or .jpg
files using:
dot file.dot -Tps -o out.ps
or:
dot file.dot -Tjpg -o out.jpg
If you are using Windows, you can download a user interface from the GraphViz website to produce similar results.
Implement either elimination of assignments to dead variables (backward analysis) as in Lecture 33 or available expression analysis (forward analysis) as in Lecture 34. For this assignment you are allowed to implement another kind of forward or backward data flow analysis, but you need to discuss it first with the instructor and only after confirmation you can implement your favorite analysis. Note that in this assignment we are not interested in precise inter-procedural analysis, so you can follow a worst-case analysis assumption as we discussed in the course.
The library of benchmarks in the competition includes the benchmarks that you contributed to it and is accessible from here:
/usr/local/pub/hh/cc/benchmarks
For this project you need to submit at least one benchmark to show that your optimization is effective.
Since we are not imposing any code structure for your programs, it is extremely important for your compiler to exactly follow a fixed interface as described here. This allows us to uniformly run and test all the projects from different groups of students. Command-line is the primary interface for your users to interact with your compiler. A general form for the command-line interface is as follows:
emjc [options] <source file>
These are the possible options for your compiler. If the user does not provide a correct option, the compiler treats it the same as the –help option.
––help
: Prints a synopsis of options––pp
: Pretty-prints the input file to the standard output––lex
: Generates output from lexical analysis as described in Assignment 2.––ast
: Generates output from syntactic analysis as described in Assignment 3.––name
: Generates output from name analysis as described in Assignment 4.––type
: Generates output from type analysis as described in Assignment 5.––cgen
: Generates output from code generation as described in Assignment 6.––cfg
: Generates dot
files for the control-flow graphs––opt
: Generates optimized executable program.––optinfo
: Generates optimized executable program with some statistics.
The first new option that you add in this assignment (––cfg
) generates the control flow graphs.
After executing the compiler with the option ––opt
for the source file filename.emj
the compiler silently generates optimized class files for a valid eMiniJava program by e.g. removing the assignments to dead variables (or another technique as described above).
The option ––optinfo
has the same effect as ––opt
, with the difference that it prints out the number of generated bytecode instructions before and after optimization:
#bytecode before optimization: <number> #bytecode after optimization: <number>
In the case that the program has errors, your compiler front-end should work as before. It prints out a set of errors (such as unknown identifier) in the following format:
<line>:<column> error:<description>
where <line> and <column> indicate the beginning position of the error, and <description> details the error.
The deadline of the this assignment is April 27th at 11:59pm (https://mycourses.rit.edu). Please upload all your files as a single zip file. You should include a readme.txt file in your package to describe how your source code can be compiled and built. Ideally your project should include a user-friendly compiling technique with Makefile, Ant or similar tools. For this assignment, your submission should include benchmark(s) that show that your optimizations work in practice.
When grading, we are serious about simple programming errors. You are writing a compiler to check the programs of other people, so your compiler should not have simple errors itself! Your compiler should never crash for an incorrect input. We expect your compiler to give a comprehensive error when the user does not provide a valid input to it.