Due Monday, March 27, 9:00am. Please hand it in before the beginning of the recitation session.
A context-free grammar is in Greibach two-standard form if productions are of the following form.
X -> aYZ X -> aY X -> a
Hint: You can assume that the grammar is in Chomsky Normal Form before working out the conversion to Greibach two-standard form.
Show that if a grammar is in Chomsky normal form then the parse tree for a word of length $n > 0$ has exactly $2n - 1$ interior nodes.
Assume a grammar in Chomsky normal has $n$ non-terminals. Show that if the grammar can generate a word with a derivation having at least $2^n$ steps, then the recognized language should be infinite.
Assume that we want to use the CYK algorithm for the grammars which are not in Chomsky normal form. For example, consider the following grammar for balanced parenthesis.
S -> ( S ) S -> SS S -> ()
The diagram below shows the parsing for "(()())" using CYK.
Describe why it is not a good idea to use CYK for the arbitrary grammars not in the Chomsky normal form.
A production is called linear if it is of the form A → aBb. In other words, if the right hand side can contain only one non-terminal. Show that there are context free languages for which no linear grammar exists.
Show that the regular languages can be recognized with LL(1) parsers. Describe a process that, given a regular expression, constructs an LL(1) parser for it.
This problem is motivated by creating type aliases for function types in C++.
For example, the following definition creates an alias called int_bool_int_type
that represents the type of a function that gets arguments of type integer and boolean and returns an integer.
typedef int (int_bool_int_type) (int,bool);
The following grammar generates type aliasing definitions for functions that works only with integers and booleans.
S → "typedef" "int" "(" identifier ")" "(" Type ")" | "typedef" "bool" "(" identifier ")" "(" Type ")" Type → Type "," Type | "int" | "bool"
The following grammar generates all the regular expressions over $\Sigma = \{a,b,(,),+,*\}$.
S -> S S S -> S '+' S S -> S '*' S -> '(' S ')' S -> 'a' | 'b'
Construct an SLR parse table for the following grammar.
S -> E + E | num E -> E * num | num