cc17:homework_2

# Homework 02

Due Monday, March 27, 9:00am. Please hand it in before the beginning of the recitation session.

## Problem 1

A context-free grammar is in Greibach two-standard form if productions are of the following form.

X -> aYZ
X -> aY
X -> a
• Prove that for any context-free grammar that does not contain ε there exists an equivalent Greibach two-standard grammar.
• Using the Greibach two-standard form prove that the class of context-free languages can be accepted by pushdown automaton.

Hint: You can assume that the grammar is in Chomsky Normal Form before working out the conversion to Greibach two-standard form.

## Problem 2

Show that if a grammar is in Chomsky normal form then the parse tree for a word of length $n > 0$ has exactly $2n - 1$ interior nodes.

## Problem 3

Assume a grammar in Chomsky normal has $n$ non-terminals. Show that if the grammar can generate a word with a derivation having at least $2^n$ steps, then the recognized language should be infinite.

## Problem 4

Assume that we want to use the CYK algorithm for the grammars which are not in Chomsky normal form. For example, consider the following grammar for balanced parenthesis.

S -> ( S )
S -> SS
S -> ()

The diagram below shows the parsing for "(()())" using CYK.

Describe why it is not a good idea to use CYK for the arbitrary grammars not in the Chomsky normal form.

## Problem 5

A production is called linear if it is of the form A → aBb. In other words, if the right hand side can contain only one non-terminal. Show that there are context free languages for which no linear grammar exists.

## Problem 6

Show that the regular languages can be recognized with LL(1) parsers. Describe a process that, given a regular expression, constructs an LL(1) parser for it.

## Problem 7

This problem is motivated by creating type aliases for function types in C++. For example, the following definition creates an alias called int_bool_int_type that represents the type of a function that gets arguments of type integer and boolean and returns an integer.

typedef int (int_bool_int_type) (int,bool);

The following grammar generates type aliasing definitions for functions that works only with integers and booleans.

S       → "typedef" "int"  "(" identifier ")" "(" Type ")"
| "typedef" "bool" "(" identifier ")" "(" Type ")"
Type    → Type "," Type
| "int"
| "bool"
• By giving at least one input which has two different parse trees prove that the grammar is ambiguous.
• Write a new grammar that determines the same language but is not ambiguous.

## Problem 8

The following grammar generates all the regular expressions over $\Sigma = \{a,b,(,),+,*\}$.

S -> S S
S -> S '+' S
S -> S '*'
S -> '(' S ')'
S -> 'a' | 'b'
• Compute the First and Follow sets for S.
• By giving two different parse trees for an input, show that the grammar is ambiguous.
• Assuming that the '*' has the highest and '+' has the lowest precedence, rewrite the grammar to an unambiguous one.
• Compute the First and Follow sets for the non-terminals of the new grammar.
• Is the new grammar LL(1)?

## Problem 9

Construct an SLR parse table for the following grammar.

S -> E + E | num
E -> E * num | num
• Is SLR powerful enough to resolve the conflicts?
• Draw the LR(1) automaton for parsing.
• Determine the set of states that can be merged for an LALR parser. 