cc17:homework_2

**Due Monday, March 27, 9:00am.** Please hand it in before the beginning of the recitation session.

A context-free grammar is in Greibach two-standard form if productions are of the following form.

X -> aYZ X -> aY X -> a

- Prove that for any context-free grammar that does not contain ε there exists an equivalent Greibach two-standard grammar.
- Using the Greibach two-standard form prove that the class of context-free languages can be accepted by pushdown automaton.

**Hint:** You can assume that the grammar is in Chomsky Normal Form before working out the conversion to Greibach two-standard form.

Show that if a grammar is in Chomsky normal form then the parse tree for a word of length $n > 0$ has exactly $2n - 1$ interior nodes.

Assume a grammar in Chomsky normal has $n$ non-terminals. Show that if the grammar can generate a word with a derivation having at least $2^n$ steps, then the recognized language should be infinite.

Assume that we want to use the CYK algorithm for the grammars which are not in Chomsky normal form. For example, consider the following grammar for balanced parenthesis.

S -> ( S ) S -> SS S -> ()

The diagram below shows the parsing for "(()())" using CYK.

Describe why it is not a good idea to use CYK for the arbitrary grammars not in the Chomsky normal form.

A production is called linear if it is of the form A → aBb. In other words, if the right hand side can contain only one non-terminal. Show that there are context free languages for which no linear grammar exists.

Show that the regular languages can be recognized with LL(1) parsers. Describe a process that, given a regular expression, constructs an LL(1) parser for it.

This problem is motivated by creating type aliases for function types in C++.
For example, the following definition creates an alias called `int_bool_int_type`

that represents the type of a function that gets arguments of type integer and boolean and returns an integer.

`typedef int (int_bool_int_type) (int,bool);`

The following grammar generates type aliasing definitions for functions that works only with integers and booleans.

S → "typedef" "int" "(" identifier ")" "(" Type ")" | "typedef" "bool" "(" identifier ")" "(" Type ")" Type → Type "," Type | "int" | "bool"

- By giving at least one input which has two different parse trees prove that the grammar is ambiguous.
- Write a new grammar that determines the same language but is not ambiguous.

The following grammar generates all the regular expressions over $\Sigma = \{a,b,(,),+,*\}$.

S -> S S S -> S '+' S S -> S '*' S -> '(' S ')' S -> 'a' | 'b'

- Compute the
**First**and**Follow**sets for S. - By giving two different parse trees for an input, show that the grammar is ambiguous.
- Assuming that the '*' has the highest and '+' has the lowest precedence, rewrite the grammar to an unambiguous one.
- Compute the
**First**and**Follow**sets for the non-terminals of the new grammar. - Is the new grammar LL(1)?

Construct an SLR parse table for the following grammar.

S -> E + E | num E -> E * num | num

- Is SLR powerful enough to resolve the conflicts?
- Draw the LR(1) automaton for parsing.
- Determine the set of states that can be merged for an LALR parser.

cc17/homework_2.txt · Last modified: 2017/03/14 15:55 by hossein