Extended MiniJava (eMiniJava)

MiniJava is a subset of the programming language Java as described in the Appendix of the Tiger book. Extended MiniJava (eMiniJava) is an extension to MiniJava that we use in this course. Each group of students will gradually develop a compiler for eMiniJava by implementing different stages of the compiler step-by-step.

General Overview

  • Object oriented. It supports classes with inheritance and method overriding (but not overloading).
  • Imperative. All class fields and local variables are mutable. The while loop and the conditional statement are the main control structures.
  • It supports the following types: int, boolean, String, int array and reference types (classes).
  • All classes in an eMiniJava program are included in a single source file.

BNF

The syntax of eMiniJava is given by the following BNF grammar:

Program           ::= MainClass ( ClassDeclaration )* <EOF>
MainClass         ::= class Identifier { public static void main ( String [ ] Identifier ) { Statement } }
ClassDeclaration  ::= class Identifier ( extends Identifier )? { ( VarDeclaration )* ( MethodDeclaration )* }
VarDeclaration    ::= Type Identifier ;
MethodDeclaration ::= public Type Identifier ( ( Type Identifier ( , Type Identifier )* )? ) { ( VarDeclaration )* ( Statement )* return Expression ; }
Type              ::= int
                    | boolean
                    | String
                    | int [ ]
                    | Identifier
Statement         ::= { ( Statement )* }
                    | if ( Expression ) Statement ( else Statement )?
                    | while ( Expression ) Statement
                    | System.out.println ( Expression ) ;
                    | Identifier = Expression ;
                    | Identifier [ Expression ] = Expression ;
                    | sidef ( Expression ) ;
Expression        ::= Expression ( && | || | == | < | + | - | * | / ) Expression
                    | Expression [ Expression ]
                    | Expression . length
                    | Expression . Identifier ( ( Expression ( , Expression )* )? )
                    | <INTEGER_LITERAL>
                    | "<STRING_LITERAL>"
                    | true
                    | false
                    | Identifier
                    | this
                    | new int [ Expression ]
                    | new Identifier ( )
                    | ! Expression
                    | ( Expression )
Identifier        ::= <IDENTIFIER>
  • <IDENTIFIER> represents a sequence of letters, digits and underscores, starting with a letter. An identifier is not a keyword. Identifiers are case-sensitive.
  • <INTEGER_LITERAL> represents a sequence of digits
  • <STRING_LITERAL> represents a sequence of arbitrary characters, except new lines and ". You don't need to support escape characters such as \n.
  • <EOF> represents the special end-of-file character
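
The lexical classes above can be checked with regular expressions. The following is a minimal sketch in Java, not a prescribed lexer design: the class name TokenClasses, the method names and the keyword set (read off the terminals of the grammar) are assumptions made for illustration, and letters are assumed to be ASCII.

```java
import java.util.Set;
import java.util.regex.Pattern;

class TokenClasses {
    // Assumed keyword set, read off the terminals of the grammar. Whether
    // words such as main or length are reserved is left open here.
    static final Set<String> KEYWORDS = Set.of(
        "class", "public", "static", "void", "extends", "return",
        "int", "boolean", "String", "if", "else", "while",
        "true", "false", "this", "new", "sidef");

    // A letter followed by letters, digits or underscores.
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");
    // A non-empty sequence of digits.
    static final Pattern INTEGER_LITERAL = Pattern.compile("[0-9]+");
    // Double quotes around any characters except newlines and double quotes.
    static final Pattern STRING_LITERAL = Pattern.compile("\"[^\"\\r\\n]*\"");

    static boolean isIdentifier(String s) {
        // An identifier has the right shape and is not a keyword.
        return IDENTIFIER.matcher(s).matches() && !KEYWORDS.contains(s);
    }

    static boolean isIntegerLiteral(String s) {
        return INTEGER_LITERAL.matcher(s).matches();
    }

    static boolean isStringLiteral(String s) {
        return STRING_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("foo_1"));        // true
        System.out.println(isIdentifier("1foo"));         // false: starts with a digit
        System.out.println(isIdentifier("while"));        // false: keyword
        System.out.println(isIntegerLiteral("042"));      // true
        System.out.println(isStringLiteral("\"comp\""));  // true
    }
}
```

A real lexer would scan a character stream and apply longest-match rules; the sketch only classifies strings that are already isolated.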

Language Semantics

The precise way to describe the semantics of a programming language is a mathematical description, for example operational semantics (see e.g. Lecture 3 of the PLC course). Since studying semantics is not officially part of the Compiler Construction course and some students may not have the background for mathematical semantics, we give an informal description of the language constructs instead.

+

The + operator can be applied to both int and String operands. When + is applied to two integers, the result is an integer. When at least one operand is a String, the result is the concatenation of the operands. When an int is concatenated with a String, the decimal string representation of the integer is used.

5 + 5           => 10
"comp" + "iler" => "compiler"
"comp" + 5      => "comp5"
5 + "comp"      => "5comp"
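
One possible way to encode this rule in a type checker and in a runtime is sketched below. The names PlusTyping, Type, plusType and plus are hypothetical, and rejecting every operand type other than int and String is an assumption based on the Miscellaneous section, which only lists concatenation of strings with strings or integers.

```java
import java.util.Optional;

class PlusTyping {
    // Hypothetical representation of eMiniJava types; all classes share CLASS.
    enum Type { INT, BOOLEAN, STRING, INT_ARRAY, CLASS }

    // Result type of "l + r", or empty if the operand types are rejected.
    static Optional<Type> plusType(Type l, Type r) {
        if (l == Type.INT && r == Type.INT) return Optional.of(Type.INT);
        boolean lOk = l == Type.INT || l == Type.STRING;
        boolean rOk = r == Type.INT || r == Type.STRING;
        // Both operands are int or String, and at least one is String.
        if (lOk && rOk) return Optional.of(Type.STRING);
        return Optional.empty();
    }

    // Runtime counterpart; assumes the operands already passed plusType.
    static Object plus(Object l, Object r) {
        if (l instanceof Integer && r instanceof Integer) {
            return (Integer) l + (Integer) r;          // integer addition
        }
        return String.valueOf(l) + String.valueOf(r);  // concatenation
    }

    public static void main(String[] args) {
        System.out.println(plus(5, 5));            // 10
        System.out.println(plus("comp", "iler"));  // compiler
        System.out.println(plus("comp", 5));       // comp5
        System.out.println(plus(5, "comp"));       // 5comp
        System.out.println(plusType(Type.BOOLEAN, Type.STRING).isPresent()); // false
    }
}
```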

< , - , * , /

These are the usual comparison (<) and arithmetic (-, *, /) operators. They are applied to int operands only: <  yields a boolean, while -, * and / yield an int.

==

Equality works on all pairs of operands that

  • either both belong to class types (even different ones), or
  • are of the same type (int, String, boolean, int[]).

The semantics of equality is

  • value-equality for int, boolean and String
  • reference-equality for objects and arrays

10 == 10                   => true
new A() == new A()         => false
new A() == new B()         => false
"comp" == 10               => // Type Error...
"c" + "omp" == "co" + "mp" => true

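A compiler or interpreter written in Java cannot simply reuse Java's == for every case, because Java compares String objects (and boxed integers) by reference. The sketch below shows one way a runtime could implement the rules above. It assumes that values are represented as boxed Java objects and that the type checker has already rejected ill-typed comparisons; the names RuntimeEquality, eq, A and B are hypothetical.

```java
class RuntimeEquality {
    // Stand-ins for two unrelated eMiniJava classes.
    static class A {}
    static class B {}

    static boolean eq(Object a, Object b) {
        // Value equality for int, boolean and String.
        if (a instanceof Integer || a instanceof Boolean || a instanceof String) {
            return a.equals(b);
        }
        // Reference equality for objects and arrays.
        return a == b;
    }

    public static void main(String[] args) {
        // Built at run time, so the two strings are distinct Java objects.
        String left = new StringBuilder("c").append("omp").toString();
        String right = new StringBuilder("co").append("mp").toString();
        int[] arr = new int[3];

        System.out.println(eq(1000, 1000));          // true
        System.out.println(eq(new A(), new A()));    // false
        System.out.println(eq(new A(), new B()));    // false
        System.out.println(eq(left, right));         // true
        System.out.println(eq(arr, arr));            // true
        System.out.println(eq(arr, new int[3]));     // false
    }
}
```
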
println

System.out.println can be used on expressions of type int, String and boolean.

sidef

The sidef keyword is used to evaluate an expression (usually a method call) just for its side effect and to discard the result.

For example, if in a class we define

public int hi() {
  System.out.println("Hello World!");
  return 0;
}

then

  • sidef(this.hi()); will print Hello World! on the standard output and will discard the result
  • sidef(true); will do nothing.

&& , ||

As in Java, && and || are short-circuit boolean operators: the right operand is evaluated only if the left operand does not already determine the result.

public static boolean printHi() {
  System.out.println("Hi");
  return false;
}
public static boolean printBye() {
  System.out.println("Bye");
  return true;
}
printHi() && printBye() // "Hi"
printBye() || printHi() // "Bye"
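
In an interpreter, short-circuiting means that the right operand must stay unevaluated until the left one is known. The following sketch illustrates this with delayed operands; it is one possible encoding, not a required design. The names ShortCircuit, and, or and log are assumptions, and the two helper methods record their message in a list instead of printing it so that the effect can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

class ShortCircuit {
    // Records side effects in the order in which they happen.
    static final List<String> log = new ArrayList<>();

    static boolean and(BooleanSupplier left, BooleanSupplier right) {
        if (!left.getAsBoolean()) return false;  // right is never evaluated
        return right.getAsBoolean();
    }

    static boolean or(BooleanSupplier left, BooleanSupplier right) {
        if (left.getAsBoolean()) return true;    // right is never evaluated
        return right.getAsBoolean();
    }

    static boolean printHi()  { log.add("Hi");  return false; }
    static boolean printBye() { log.add("Bye"); return true; }

    public static void main(String[] args) {
        boolean first = and(ShortCircuit::printHi, ShortCircuit::printBye);
        System.out.println(first + " " + log);   // false [Hi]
        log.clear();
        boolean second = or(ShortCircuit::printBye, ShortCircuit::printHi);
        System.out.println(second + " " + log);  // true [Bye]
    }
}
```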

!

! is the boolean negation.

Arrays

new int[size], where size: int, returns a new array of length size.

a[i], where a: int[] and i: int, returns the value stored in the ith position of a.

a.length, where a: int[], returns the length of a.

a[i] = j, where a: int[], i: int and j: int, sets the value in the ith position of a to j.

Miscellaneous

  • Comments can be written using the double slash notation (//) or as blocks ( /* */ ). Nested block comments are not allowed.
  • The else branch of the if construct is optional.
  • Inheritance works as in Java. eMiniJava doesn't support the notions of interfaces, abstract members or abstract classes.
  • eMiniJava only supports default constructors (constructors without arguments).
  • Method overloading is not allowed, but method overriding is allowed. If you are unclear about the difference between the two, see for instance this page. The only overloading in eMiniJava is on the + operator, which can be used with integers, strings, and between the two types.
  • eMiniJava does not allow two fields with the same name, either in the same class or in two classes where one inherits from the other (field overriding is not allowed).
  • eMiniJava does not allow two variables with the same name in a method (parameters and locally defined variables are counted together). However, a method may define a variable with the same name as a field of its enclosing class. In that case, the field is shadowed: the method variable takes precedence.
  • Accessing a field that has not been initialized results in undefined behavior (you can use a default value in this case, but your compiler must not crash in any case).
  • Only constant strings (strings given as literals) are allowed.
  • Although the grammar specifies that expressions such as "foobar".method() are legal, they have no meaning in eMiniJava (they would result in a type error). The only operations allowed are: concatenating strings with other strings or with integers, printing strings, passing strings as argument and returning strings.
  • The operator precedence is the same as in Java. From highest priority to lowest: !, then * and /, then + and -, then < and ==, then &&, then ||. Also, new binds tighter than . (method call or .length), which binds tighter than [] (array read), which binds tighter than any operator. So 1 + new Foo().bar().baz()[42] means 1 + ( ( ( (new Foo()).bar()).baz())[42]).
  • All binary operators are left-associative. E.g. 1-2+3 means (1-2)+3 and not 1-(2+3). (Of course this does not matter for operators of different precedence).
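
The precedence and associativity rules for the binary operators and ! can be implemented with precedence climbing. The sketch below is a simplified illustration under several assumptions: it only handles integer literals, identifiers, !, parentheses and the binary operators (no method calls, .length, array reads or new), it assumes well-formed input, and it returns a fully parenthesized string instead of building a syntax tree. The name PrecedenceSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrecedenceSketch {
    // Binding strength of each binary operator; a higher number binds tighter.
    static final Map<String, Integer> PREC = Map.of(
        "||", 1, "&&", 2, "==", 3, "<", 3,
        "+", 4, "-", 4, "*", 5, "/", 5);

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    PrecedenceSketch(String src) {
        // Very small tokenizer: operators, parentheses, names and numbers.
        Matcher m = Pattern.compile("\\|\\||&&|==|[-+*/<!()]|[A-Za-z0-9_]+").matcher(src);
        while (m.find()) tokens.add(m.group());
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Parses an expression whose binary operators all have precedence >= minPrec.
    String parseExpr(int minPrec) {
        String left = parseUnary();
        while (peek() != null && PREC.containsKey(peek()) && PREC.get(peek()) >= minPrec) {
            String op = tokens.get(pos++);
            // Requiring a strictly higher precedence on the right-hand side
            // makes operators of equal precedence left-associative.
            String right = parseExpr(PREC.get(op) + 1);
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    // ! binds tighter than every binary operator.
    private String parseUnary() {
        String t = tokens.get(pos++);
        if (t.equals("!")) return "(!" + parseUnary() + ")";
        if (t.equals("(")) {
            String inner = parseExpr(1);
            pos++;  // skip the closing parenthesis
            return inner;
        }
        return t;   // integer literal or identifier
    }

    static String parenthesize(String src) {
        return new PrecedenceSketch(src).parseExpr(1);
    }

    public static void main(String[] args) {
        System.out.println(parenthesize("1-2+3"));              // ((1 - 2) + 3)
        System.out.println(parenthesize("1 + 2 * 3 < 4"));      // ((1 + (2 * 3)) < 4)
        System.out.println(parenthesize("a || b && !c == d"));  // (a || (b && ((!c) == d)))
    }
}
```

Postfix forms such as method calls and array reads would be handled inside parseUnary, after the primary expression has been read, since they bind tighter than any operator.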

Examples

There is a set of MiniJava programs on the MiniJava webpage. Since eMiniJava is a superset of MiniJava, they are eMiniJava programs as well.

cc17/eminijava.txt · Last modified: 2017/02/20 17:17 by hossein