Introduction to the λcalculus
[general
logic
lambda calculus
]
The λcalculus has had a pervasive impact on computer science – and not just in functional programming languages, most of which can be viewed as extended λcalculi, but even in modern systems programming languages such as Rust. Polymorphic type checking emerged in the context of the typed λcalculus and now exists (to some extent) in many modern languages. Turning to theory, denotational semantics is normally expressed within a form of λcalculus. Both higherorder logic and dependent type theories are extensions of the λcalculus. But it is a mysterious beast: let’s take a look.
Syntax of the λcalculus
Strictly speaking, syntax is all there is: that’s why it is called a calculus as opposed to a theory. You can express “functions”, but the λcalculus is not about functions in any real sense. Dana Scott has written at length about how hard it is to relate the λcalculus to any natural mathematical objects, especially because a function can be applied to itself.^{1}
A λterm is defined inductively to be one of the following, where $M$ and $N$ are already existing λterms:
 $x$, $y$, … are the variables: we assume infinitely many to be available.
 $(\lambda x.M)$ is the abstraction of $M$ over the variable $x$.
 $(M N)$ is the combination of $M$ and $N$.
This may seem too simple. Where are the basic values like numbers and booleans? Believe it or not, they can be encoded, even advanced data structures such as infinite lists. Why can functions take only one argument? Frege and Schönfinkel had both noticed that multiple arguments are formally unnecessary: a technique now known as currying (according to one rumour, because Dana Scott was fond of curry). Our tiny formal calculus is sufficient to encode the full world of computation.
Some parentheses can be omitted. We can write $\lambda xyz.M$ instead of $(\lambda x.(\lambda y.(\lambda z.M)))$ and $MNPQ$ instead of $(((MN)P)Q)$. We can write $\lambda x.MN$ instead of $\lambda x.(MN)$.
These conventions are particularly suitable for curried functions. But please note that the traditional function notation $f(x,y,z)$ is absolutely unavailable to us.
A bit of theory for the λcalculus
Church had originally intended the λcalculus to be a form of predicate logic, but his conception became one of the victims of Russell’s paradox. (He was finally successful, later, using a typed λcalculus.) And in the beginning, Church insisted that $\lambda x.M$ should only be allowed if $x$ was actually used in $M$: he did not like the idea of a function that simply ignored its argument. Such a restriction is barely imaginable today, just as our programming languages no longer insist that a forloop must perform at least one iteration.
What does it mean for a variable to be “used”? For that, we need the notions of bound and free variables. In $\lambda x.M$, all occurrences of the variable $x$ are bound. Variable occurrences that are not bound are said to be free, such as the $y$ in $\lambda x.xy$ or even the first $x$ in $x(\lambda x.x)$. To be precise, let’s define the sets BV and FV:
\[\newcommand\BV{\mathop{\rm BV}} \newcommand\FV{\mathop{\rm FV}} \begin{align*} \BV(x) & = \emptyset \\ \BV(\lambda x.M) & = \BV(M) \cup \{x\} \\ \BV(M N) & = \BV(M) \cup \BV(N) \end{align*}\] \[\begin{align*} \FV(x) & = \{x\} \\ \FV(\lambda x.M) & = \FV(M) \backslash \{x\} \\ \FV(M N) & = \FV(M) \cup \FV(N) \end{align*}\]Computation steps in the λcalculus work by substitution. Specifically, we substitute for free occurrences of some variable $y$ in the term $M$. That’s because a λbinding introduces a nested scope: the λcalculus has given us another basic programming language concept: local variables.
The operation of substituting $L$ for all free occurrences of $y$ in $M$ is written $M[L/y]$ and is defined as follows:
 For a variable: $x[L/y]$ is $L$ if $x=y$, otherwise $x$.
 For an abstraction: $(\lambda x.M)[L/y]$ is simply $\lambda x.M$ if $x=y$, otherwise $\lambda x.M[L/y]$
 For a combination: $(M N)[L/y]$ is done recursively as $M[L/y]\; N[L/y]$.
There is enough λcalculus theory to fill a large book.
Using the theoretical notions defined above, we can define computation: the famous λconversions.
Computation in the λcalculus
The 1930s saw growing interest in the concept of a computable function. Alan Turing’s paper on computable numbers appeared in 1936 and Church was working on related questions at the same time. The ChurchTuring thesis, which identifies the computable functions as those computable using a Turing machine, or (equivalently!) the λcalculus, dates from that year.
However, Church’s early writings do not stress computability. Here is an excerpt from the introduction to his monograph on the λcalculus.
A function is a rule of correspondence by which when anything is given (as argument) another thing (the value of the function for that argument) may be obtained. That is, a function is an operation which may be applied on one thing (the argument) to yield another thing (the value of the function). … In particular it is not excluded that one of the elements of the range of arguments of a function f should be the function f itself. ^{2}
The most basic rule of the $\lambda$calculus is βreduction, which replaces $(\lambda x. M)N$ by $M[N/x]$: a copy of $M$ in which every free occurrence of $x$ has been replaced by $N$. It is only valid provided that substitution does not capture a free variable within a λbinding, for which $\BV(M)\cap\FV(N)=\emptyset$ is a sufficient condition.
Here are two examples of βreduction:
 $(\lambda x.(xx))(yz) \to_\beta ((yz)(yz))$
 $((\lambda z.(zy))(\lambda x.x)) \to_\beta ((\lambda x.x)y) \to_\beta y$
Strangely enough, this natural rule is not enough to cover Church’s conception of a function. For one thing, due to the risk of variable capture, we need the ability to rename bound variables.
The αconversion $(\lambda x.M) \to_\alpha (\lambda y.M[y/x])$ renames the abstraction’s bound variable from $x$ to $y$. It is valid provided $y$ does not occur (free or bound) in $M$. For example, $(\lambda x.(xz)) \to_\alpha (\lambda y.(yz))$.
We are still not finished. For suppose that $M$ and $M’$ enjoy the property that $MN=M’N$ for all $N$. If we really regard them as rules of correspondence, and they behave identically, then we must insist that $M=M’$. (Note that we have not yet defined equality, but rather are trying to do so now.) This fundamental principle is sometimes called extensionality of functions.^{3}
The ηreduction $(\lambda x.(Mx)) \to_\eta M$ replaces the trivial function $\lambda x.Mx$ — where $M$ does not depend on $x$ — by $M$. Formally, the condition above is $x\not\in\FV(M)$: the abstraction does nothing but apply $M$ to its argument. For example, $(\lambda x.((zy)x)) \to_\eta zy$.
We write $M=N$ if it is possible to transform $M$ into $N$ through any series of α, β or η steps, possibly within a term (not just at top level) and possibly backwards (βexpansion and ηexpansion). Most of the desirable properties of equality can be shown to hold quite easily. It is an equivalence relation and has the natural substitutivity properties, for example, if $M’=M$ then $M’N=MN$.
As a small demonstration, here is a proof that η implies extensionality. Suppose we have $M$ and $M’$ such that $MN=M’N$ for all $N$. Then pick a fresh variable $x$ (that is, $x\not\in\FV(MM’)$. We have $Mx=M’x$, hence $\lambda x.Mx = \lambda x. M’x$ (by substitutivity) and hence $M=M’$ (by two ηreductions).
One thing is still missing: a consistency proof. A theory where $M=N$ for all terms $M$ and $N$ is useless. That equality in the λcalculus is nontrivial was proved by Church and Rosser, in a saga both protracted and painful.^{4}

Dana Scott. Lambda Calculus: Some Models, Some Philosophy. In: J. Barwise, H. J. Keisler and K. Kunen, eds., The Kleene Symposium (NorthHolland, 1980), 258 ↩

Alonzo Church. The Calculi of LambdaConversion. Princeton University Press, 1941 ↩

Extensionality fails in most dependent type theories: a serious limitation. ↩

J. Barkley Rosser. Highlights of the History of the LambdaCalculus. ↩