# ### Site Tools

plt17:assignment_2

# The λ-calculus in Coq

## Introduction

The goal of this assignment is to make you comfortable in proving theorems with Coq. It will be really helpful if you have access to a list of Coq tactics (for example from here) during this assignment. There are easy ways to “prove” theorems in Coq (e.g. by using the tactic Admitted). You are not allowed to use those proof techniques in this assignment. There are a couple of Coq commands that you may find them useful. Coq's Search command displays the name and the type of all objects (theorems, axioms, …) that contain a given identifier. Moreover, if you want to see the definition of an object, you can use the Print command.

## Coq IDE

To use Coq effectively, you must have a working IDE for it. There are two in common use: Proof General that runs under Emacs and CoqIDE (usually bundled with Coq installation). Make sure that you have installed one of those IDEs (the one that you feel more comfortable with it) for this assignment.

In this project, you will continue working with the λ-calculus definition from Lecture 10. Define a function that calculates the set of free variables in a λ-expression. A set can be defined using a predicate $P(x)$ that determines if an object belongs to the set or not $\{x | P(x)\}$. Set is a collection of objects for which $P$ is true. Define the function free that gets a variable and determines if it is free in a λ-expression or not.

Fixpoint free (x: string) (e: term) : Prop.

Prove the following theorem: if two λ-expressions are equal up to α-conversion, then their set of free variables are also equal.

Prove the following theorem: if a variable does not appear free in a term, then substituting it with any other expression has no effect.

Theorem subset_fresh : forall e1 e x, ~ (free x e) → subst e1 x e = e.

As we discussed in the class, during β-reductions we may need to rename some of the bound variables to avoid the name clashes (α-conversion). When implementing interpreters for the λ-calculus, generating fresh variables for every reduction can be cumbersome. A good way to avoid variable renaming is to use the De Bruijn indices. In this notation, the bound variable occurrences are replaced with positive integers (called de Bruijn indices). The positive integer $n$ refers to the variable bound by the $n$-th surrounding binder. For example, the λ-term $\lambda x. \lambda y. x$ is written with de Bruijn indices as $\lambda(\lambda 2)$, whereas $\lambda x. \lambda y.y$ is written as $\lambda(\lambda 1)$.

Notice that there can be λ-expressions that do not have any meaning in this representation. For example, $\lambda 2$ does not have any meaning. Define an inductive data type called dbterm (similar to the λ-expression definition that was defined in class) for λ-expressions in the de Bruijn notation.

Write the function isValid (t: dbterm) : bool which determines if the given λ-expression t is in correct De Bruijn encoding. Test your isValid function with some values using Eval compute in …. For example, your function should return false for $\lambda 2$.

Note 1: Operations such as <, >, = are defined to return Prop in Coq, which can't be used e.g. in if/then/else. You need to use the operations beq_nat (natural number equality) and leb (less than or equal) from Arith for comparisons that return bool.

Note 2: If you are interested in the differences between Prop and bool, you can read the Chapter 4 from “Certified Programming with Dependent Types”.

β-reduction must be adapted for the de Bruijn notation. In order to reduce $(\lambda x. e) y$, it does not suffice to substitute $x$ with $y$ in the appropriate places. If there are occurrences of $2, 3, 4, \cdots$ in $e$, these become one off, since one of the λ binders surrounding $e$ has been removed. Hence, all the remaining free indices in $e$ must be decremented. So the reduction $(\lambda x. e) y ~\rightarrow_\beta~ e [x \mapsto y]$ becomes conceptually a reduction as the following: $(\lambda x. e) y ~\rightarrow_\beta~ e[1 \mapsto y, 2 \mapsto 1, 3 \mapsto 2, \cdots]$.
For pushing the substitution on right hand side inside $e$, we need to have an operation called “lifting”. The operation lift increments the indices greater or equal than $k$ inside a λ-expression by $n$. Write the function lift for a given λ-expression.

Assume we make the following conjecture: if the λ-expression $e$ has a correct de Bruijn encoding, then the result of lifting $e$ still has a correct de Bruijn encoding. Can you prove the conjecture in general? Do you think if this is a correct statement? If the conjecture is wrong, try to formulate and prove a weaker version of the conjecture which says if $e$ has a correct de Bruijn encoding then for specific values of $k$ and $n$ (you need to come up with those values) then lifting $e$ has a correct de Bruijn encoding. 