This assignment is a part of the semantic analysis phase of the compiler that you construct in this course. You will add name analysis to your Extended MiniJava compiler in this assignment. Name analyzer can significantly simplify implementing the type checker. Part of your tasks for this assignment is to fix any problems that you still have in your previous phases (if any). The goal of name analysis is twofold: we want to reject programs which contain certain types of errors, and we want to associate symbols to all identifiers. If you have any questions about this assignment please do not hesitate to ask the instructor.
Symbols are values which uniquely identify all class, method and variable names in a program, by mapping their (multiple) occurrences to their (unique) definition. Identifiers are already present in the AST and contain names as well, but these are not sufficient to distinguish e.g. between a class member and a local variable with the same name. This week you will add this missing information to the ASTs.
In the process of mapping occurrences to definitions, the compiler will be able to detect the following kinds of errors in faulty programs:
class
is defined more than onceclass
members in a class have the same nameclass
member has the same name as one in an inherited class (field overriding is forbidden in Extended MiniJava)this
is referenced from the main function (this
is disallowed in a static method)Note: We cannot check for the following errors yet, because we need type checking for them:
Scopes in Extended MiniJava are only of three kinds:
This in fact defines a hierarchy among symbols:
As your compiler grows, testing your compiler can become a challenge. Sometimes you may encounter a bug in your compiler implementation that you cannot immediately understand from which phase the incorrect behavior comes from. A good way to debug your compiler is to write a pretty-printer for you AST. This can be done in a similar way as the pretty printing worked for the while language in the first assignment. After your compiler constructs an AST for a given input and attaches symbols to its identifiers, a call to the pretty pretty-printer will print the AST as a eMiniJava program. As we discussed in Lecture 19, in static scoping we can rename variables to avoid any shadowing (make all vars unique). So when pretty printing the current program:
class Example { public static void main(String [] args) { System.out.println(new B().foo()); } } class B extends A { public int foo() { val = 42; return val; } } class A { int val; public int foo() { boolean val; val = false; return 41; } }
The analyzer should print an output like this:
class Example_0_ { public static void main(String[] args_1_) { System.out.println((new B_2_().foo_#error#_())); } } class B_2_ extends A_3_ { public int foo_7_() { val_4_ = 42; return val_4_; } } class A_3_ { int val_4_; public int foo_5_() { boolean val_6_; val_6_ = false; return 41; } }
Note that:
Here is how we recommend you proceed for the implementation:
this
. Make sure to attach symbols to all identifiers, whether they are in expressions, statements, or type trees.fatal
error in this phase; the name analyzer should report as many number of errors as it can.Here are all the constraints that your analyzer should enforce:1)
Shadowing can occur in two different situations:
All other types of shadowing are not allowed in Extended MiniJava.
this
is only referred to from a class method (as opposed to the main method)Since we are not imposing any code structure for your programs, it is extremely important for your compiler to exactly follow a fixed interface as described here. This allows us to uniformly run and test all the projects from different groups of students. Command-line is the primary interface for your users to interact with your compiler. A general form for the command-line interface is as follows:
emjc [options] <source file>
For this phase of assignment, there are five possible options. If the user does not provide a correct option, the compiler treats it the same as the –help option.
––help
: Prints a synopsis of options––pp
: Pretty-prints the input file to the standard output––lex
: Generates output from lexical analysis as described in Assignment 2.––ast
: Generates output from syntactic analysis as described in Assignment 3.––name
: Generates output from name analysis
After executing the compiler with the option ––name
for the source file filename.emj
the compiler either accepts the program as a good eMiniJava program with printing out the following line:
Valid eMiniJava Program
Or, it prints out a set of errors (such as unknown identifier). Like the previous phases the format of an error message is the following:
<line>:<column> error:<description>
where <line> and <column> indicate the beginning position of the error, and <description> details the error.
Please ensure to test your code extensively before submitting your code. You can compare the outputs of your submission with other groups, but sharing code is forbidden among groups. The benchmarks are available in the following folder which is accessible from the CS Department Linux systems (e.g., queeg.cs.rit.edu and ICLs 1,2, and 3).
/usr/local/pub/hh/cc/benchmarks
The deadline of the this assignment is March 21st at 11:59pm (https://mycourses.rit.edu). Please upload all your files as a single zip file. You should include a readme.txt file in your package to describe how your source code can be compiled and built. Ideally your project should include a user-friendly compiling technique with Makefile, Ant or similar tools.
When grading, we are serious about simple programming errors. You are writing a compiler to check the programs of other people, so your compiler should not have simple errors itself! Your compiler should never crash for an incorrect input. We expect your compiler to give a comprehensive error when the user does not provide a valid input to it.