# Lambda Calculus

## Introduction and Objectives

There are three major paradigms of programming languages that are popular today (and a couple more
worth your consideration). Each of these paradigms can be reduced to a specific *model of computation*.
In this chapter we will consider the $\lambda$-calculus, which is the model of computation for the functional
programming languages.

When you are done reading this and working through the exercises, you should be able to explain the three constructs that are part of $\lambda$-calculus, show how to simulate arithmetic and recursion, and explain why a computer scientist should be concerned with a theoretical language that can’t even be bothered to have integers or booleans.

## The Language

Alonzo Church and Stephen Kleene developed $\lambda$-calculus in the 1930’s to reason about the properties of functions. By giving a fully general set of rules for evaluating a function we are able to reason about the computational aspects of functions. (An alternative is to think of functions as a set of pairs of arguments and values. In this case we emphasize the relation between an argument and a value rather than the computation necessary to convert an argument to a value.)

To this day $\lambda$-calculus is very popular with programming language researchers. This is for two reasons. First, $\lambda$-calculus is Turing complete---any computation that can be done in a “real” language can be done in $\lambda$-calculus. Second, the language is very small, allowing us to focus on the features we want to study without being distracted by the mechanics of the language itself. It is sometimes called “the little white mouse” of programming language research.

The set of lambda terms $\Lambda$ consists of variables, functions, and function applications. Let $M$ and $N$ represent an arbitrary lambda terms and $x$ represent an arbitrary variable. Then we can define the syntax of $\Lambda$ as:

$\begin{array}{lll} \Lambda ::= & x & \textrm{variables} \\ & M N & \textrm{function application} \\ & \lambda x. M & \textrm{functions, a.k.a. $\lambda$ abstractions} \\ \end{array}$A *variable* in our presentation of $\lambda$-calculus will be a symbol of one
letter, possibly with “decorations” such as superscripts, subscripts, primes,
etc. So, $x$, $y^3$, $x_n$, $a'$, $\hat z$ are all examples of valid variable
names. $a3$, $foo$, and $bar$ are not valid, because their names are more than
one symbol long.

A *function application* is denoted by juxtaposition. We will usually *not* separate
terms with spaces, and instead “run them together.” This works because all
variable names will be one letter long.

Here are some example function applications: $fx$, $a (mn)$, $fxy$ The first example is $f$ applied to $x$. The second example applies $m$ to $n$, and then applies $a$ to that result. The third example applies $f$ to two variables $x$ and $y$. Function application associates to the left, so $abcd$ is the same as $((ab)c)d$.

A *function* is denoted by the Greek letter $\lambda$ (pronounced *LAM-duh*),
followed by a parameter name(s), then a period, and then the body of the
function. The usual convention is that the body of the function extends as far
as it can go. Functions are also called $\lambda$-*abstractions* or just
*abstractions*.

Here are some example functions:

$\lambda x. x$ | The identity function. |

$\lambda a . \lambda b . a$ | Takes two arguments and returns the first |

$\lambda f . \lambda x . f(fx)$ | Takes two arguments and applies the first one to the second one twice. |

It is common to “stack up” parameters of adjacent $\lambda$‘s. So

$\lambda a . \lambda b . a$ | $\equiv$ | $\lambda a b.a$ |

$\lambda f . \lambda x . f (f x)$ | $\equiv$ | $\lambda f x.f (f x)$ |

In longer expressions you may need parentheses to be sure that they are interpreted correctly.

$\lambda x. x \lambda y.y$ | This is a function that applies $x$ to the identity. |

$(\lambda x. x) \lambda y.y$ | This is the identity function applied to the identity function. |

### Question

Which of the following are valid $\lambda$ expressions?

1 | $\lambda x.xyz$ |

2 | $\lambda x.\lambda y.$ |

3 | $a$ |

4 | $x\lambda wy.y$ |

5 | $x\lambda$ |

6 | $\lambda \lambda xz.zx$ |

7 | $(abcd)(efgh)ij\lambda klm.mkl$ |

### Test

Valid | $\lambda x.xyz$ |

Not Valid | $\lambda x.\lambda y.$ |

Valid | $a$ |

Valid | $x\lambda wy.y$ |

Not Valid | $x\lambda$ |

Not Valid | $\lambda \lambda xz.zx$ |

Valid | $(abcd)(efgh)ij\lambda klm.mkl$ |

## Free and Bound Variables

When a variable is listed between the $\lambda$ and the dot, i.e., it is a parameter of a function, then we say that the variable is \emph{bound} to that particular $\lambda$. It is possible for there to be more than one $\lambda$-binding a variable with the same name. These are counted as two separate variables. \sidenote{Many authors forbid the use of the same variable name in more than one $\lambda$ to avoid this confusion.} In such a case, a particular variable in an expression is bound by the nearest enclosing $\lambda$. If a variable is not bound by any $\lambda$ we say that it is \emph{free}.

$\lambda x. x\ \lambda y.y$ | The $x$ is bound to the first $\lambda$. |

$(\lambda x. (\lambda x.x))\ x$ | The first $x$ is bound to the second $\lambda$, and the second $x$ |

is bound to the first $\lambda$. | |

$\lambda y. \lambda z.x$ | The variable $x$ is free. |

A free variable will not be given a value by the enclosing expression. You can think of them as being “global.”

### Question

In the following expressions, first use parenthesis to make clear the extent of each $\lambda$ abstraction, and then indicate to which $\lambda$ the variable $x$ is bound.

For example, in $\lambda x.\lambda z.x$, we could parenthesize it as $(\lambda x.(\lambda z.x))$, and the $x$ is bound to the first $\lambda$.

$\lambda x. \lambda y . x$ |

$\lambda x. \lambda x . x$ |

$\lambda x. x \lambda y . x$ |

$\lambda x. x \lambda x . x$ |

$\lambda z. x \lambda y . x$ |

$\lambda z. x \lambda x . x$ |

### Test

$(\lambda x. (\lambda y . x))$ | The first $\lambda$. |

$(\lambda x. (\lambda x . x))$ | The second $\lambda$. |

$(\lambda x. x (\lambda y . x))$ | Both are bound to the first $\lambda$. |

$(\lambda x. x (\lambda x . x))$ | The first $x$ is bound to the first $\lambda$, the second $x$ is bound to the second $\lambda$. |

$(\lambda z. x (\lambda y . x))$ | Both $x$’s are free. |

$(\lambda z. x (\lambda x . x))$ | The first $x$ is free but the second is bound to the second $\lambda$. |

## Evaluating Expressions using $\beta$-reductions

There is only one operation in $\lambda$-calculus: $\beta$-reduction (sometimes spelled as \emph{beta-reduction}). A $\beta$-reduction occurs when you have a $\lambda$ abstraction applied to another $\lambda$ term. For example, $(\lambda x.x)y$ is reducible to $y$. The terms $(\lambda x.xy)$ and $x\lambda y.y$ are not reducible. The first because there is only an abstraction, but nothing after it; the second because the $x$ is not an abstraction.

Be careful when the $\lambda$ abstraction and the term following it are both arguments to another term. In the term $x (\lambda y.y) z$, the $(\lambda y.y)$ and $z$ are both arguments to $x$. The $\lambda y.y$ is not being applied to $z$.

Here’s how to perform a $\beta$-reduction. Remove the initial $\lambda$ and its parameter. Then, in the body of that function, replace all occurrences of that variable with the argument. So, any variable that was bound the that lambda gets replaced. Mathematically, we would say

[(\lambda x.M) N \rightarrow [N/x]M]

Where $[N/x]M$ mean “replace all $x$’s in $M$ that were bound to the $lambda$ by $N$.” It might help to think of using an editor to do a search and replace.

Another notation you might see is this one:

[(\lambda x.M) N \rightarrow M[x:=N]]

It has the same meaning.

Here are a few examples of $\beta$ reductions. Sometimes one reduction is followed by another one.

\begin{tabular}{l} $(\lambda x.x)y \rightarrow y$ \ $(\lambda z.x)y \rightarrow x$ \ $(\lambda z.azbz)y \rightarrow ayby$ \ $(\lambda x.(\lambda z.z)x)y \rightarrow (\lambda z.z)y \rightarrow y$ \ $(\lambda x.x(\lambda z.ax)(\lambda x.bx))y \rightarrow y (\lambda z.ay)(\lambda x.bx)$ \ $(\lambda x.(\lambda z.zx)(\lambda x.bx))y \rightarrow (\lambda z.zy)(\lambda x.bx) \rightarrow (\lambda x.bx)y \rightarrow by$\ \end{tabular}

Try doing these reductions. Reduce each expression as much as you can.

| $(\lambda x.x x)y$ | | $(\lambda x.axxa)y$ | | $(\lambda x.(\lambda z.zx)q)y$ | | $(\lambda x.x((\lambda z.zx)(\lambda x.bx)))y$ | | $(\lambda a.a) (\lambda b.b) (\lambda c.c c) (\lambda d.d)$ |

### Test

| $(\lambda x.x x)y \rightarrow yy$ | | $(\lambda x.axxa)y \rightarrow ayya$ | | $(\lambda x.(\lambda z.zx)q)y \rightarrow (\lambda z.zy)q \rightarrow qy$ | | $(\lambda x.x((\lambda z.zx)(\lambda x.bx)))y \rightarrow y((\lambda z.zy)(\lambda x.bx)) \rightarrow y((\lambda x.bx)y) \rightarrow y(by)$ | | $(\lambda a.a) (\lambda b.b) (\lambda c.c c) (\lambda d.d) | | \rightarrow (\lambda b.b) (\lambda c.c c) (\lambda d.d) | | \rightarrow (\lambda c.c c) (\lambda d.d) | | \rightarrow (\lambda d.d)(\lambda d.d) | | \rightarrow (\lambda d.d)$ |

### A note about reduction order

You might be wondering what to do if you have a choice about which application to make. Consider this example:

$(\lambda x . x ((\lambda y .y) x)) ((\lambda a.a)(\lambda b.b))$We will be using \emph{normal order reduction}, which says that the leftmost, outermost reduction is performed first. In this case, the $\lambda x$ is applied first.

Another option is *applicative order*. In that system, the leftmost,
innermost reduction is performed first. This would be the $\lambda a$
reduction. Essentially, this is “call by value,” as the arguments to a
function are evaluated before the function is called.

For now, we are not going to emphasize the different reduction orders. Use normal order reduction for everything, i.e., don’t reduce the arguments before calling the function.

Finally, consider this example:

$(\lambda x . x ((\lambda y .y) x))$Do we do the $\lambda y$ reduction or not? In other words, do we do computation
inside of a function before the function has been called? Most languages say
```
no.'' (This is known as *weak head normal form*, but we are not
going to talk about that in this course.) If we say
```

yes,”
then we get $\lambda x.xx$, and say that the result is in *normal form*.
This is the style we will use.

## Alpha Capture and Renaming

Consider the following two $\lambda$ terms:

$X \equiv (\lambda a.\lambda b.ab) \qquad \textrm{and} Y \equiv (\lambda w.\lambda x.wx)$Clearly (we hope!) terms $X$ and $Y$ do not differ at all in meaning.
The names of the variables are different, but what these functions actually
*do* when you apply them is identical. When two terms have the same
structure and differ only in the names of the variables, they are said to
be *$\alpha$-equivalent*. If we rename the variables in a term in
such a way that the structure is preserved, is is called *$\alpha$-renaming*.

Here is an example of two terms that are **not** $\alpha$-equivalent.

Notice how the second term $Y$ applies its arguments in the reverse order compared to $X$. The structure of the terms are different, so they are not $\alpha$-equivalent.

Which of the following pairs are $\alpha$-equivalent?

$\begin{array}{ll} \lambda a. \lambda b. a b b & \lambda b. \lambda a. b a a \\ \lambda a. \lambda b. a b b & \lambda i. \lambda j. j j i \\ \lambda x. x \lambda y. x & \lambda e. e \lambda f. f \\ \lambda x. x \lambda y. x & \lambda e. e \lambda f. e \\ \end{array}$### Test

Sometimes it happens that a free variable or a variable that is bound to one
$\lambda$ ends up being moved around the expression in such a way that it gets
“captured” by another $\lambda$. This is known as *$\alpha$-capture*,
and is almost always a bad thing. Here are some examples.

The free variable $y$ has been captured by the $\lambda y$. Contrast this example, where we have $\alpha$-renamed the $\lambda y$ term to $\lambda z$:

$(\lambda x .\lambda z. z x) y \rightarrow \lambda z. z y$In general, if you are performing a reduction and find that you are going to capture a free variable, you need to $\alpha$-rename the capturing lambda. You cannot rename the free variable!

You can see that the non-capturing version has a different structure than the capturing version.

It is easy to write examples that cause capture when you use free variables, but you can do it with only bound variables too, if you reduce to normal form.

$(\lambda f . \lambda x . f x) (\lambda y.\lambda x.y) \rightarrow \lambda x. (\lambda y.\lambda x. y) x \rightarrow \lambda x. \lambda x.x$We can prevent this by $\alpha$-renaming the second term.

$(\lambda f . \lambda x . f x) (\lambda y.\lambda z.y) \rightarrow \lambda x. (\lambda y.\lambda z. y) x \rightarrow \lambda x. \lambda z.x$For this reason, it is best that you always use distinct variable names for your lambdas.

# Church Numerals

The $\lambda$-calculus doesn’t have numbers, but we can model them using functions. We need to know what a number actually is in order to model it. For our model, we will say that a number $n$ can be thought of as a potential: someday we are going to do something $n$ times. If we are going to do something, we also need something to do it to.

Therefore, a *numeral*^{1} in $\lambda$ calculus will be a function that
takes two arguments, an action and a target, and performs the action a certain
number of times to the target. These are called Church Numerals, after Alonso
Church.