## Ackermann's function is not primitive recursive, II

[examples | Isar | Ackermann's function | inductive definitions]
The previous post
presented the first half of the proof that Ackermann’s function $A(i,j)$ is not primitive recursive: a series of inequalities describing how the function grows with various arguments.
In this post, we’ll see how to define the primitive recursive functions inductively.
Using the aforementioned inequalities, it will be straightforward to prove
(by induction on the construction of some PR function *f*) that we can always find an argument to dominate *f*.
This celebrated result has an easy proof, and it provides a distinctive example
of an inductive definition.

### The primitive recursive functions

I’m assuming that you have already encountered the PR functions in some course on computation theory. If not, the Wikipedia article is an excellent overview. They are a family of $k$-ary functions over the natural numbers, for all $k$, including the following:

- The *successor function* $S:\mathbb{N}\to\mathbb{N}$ such that $S(x)=x+1$
- *Constant functions* $C^k_n:\mathbb{N}^k\to\mathbb{N}$ such that $C^k_n(x_1,\ldots,x_k)=n$
- *Projection functions* $P^k_i:\mathbb{N}^k\to\mathbb{N}$ such that $P^k_i(x_1,\ldots,x_k)=x_i$

Primitive recursive functions can be combined as follows:

- The *composition* $g\circ(f_1,\ldots,f_m)$ of an $m$-ary function $g$ and $k$-ary functions $f_1,\ldots,f_m$ is the $k$-ary function mapping \((x_1,\ldots,x_k) \mapsto g(f_1(x_1,\ldots ,x_k),\ldots ,f_m(x_1,\ldots ,x_k)).\)
- *Primitive recursion* combines the $k$-ary function $g(x_1,\ldots ,x_k)$ and the $(k + 2)$-ary function $h(y,z,x_1,\ldots ,x_k)$ to obtain the $(k+1)$-ary function $f$ defined by
  \(f(0,x_1,\ldots ,x_k) = g(x_1,\ldots ,x_k)\) and
  \(f(y+1,x_1,\ldots ,x_k) = h(y,f(y,x_1,\ldots ,x_k),x_1,\ldots ,x_k).\)

Finally, and crucially, there are no primitive recursive functions other than those specified above. Our initial task is to formalise these ideas in higher-order logic.
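To make the schema concrete, here is the standard textbook construction of addition (a routine example, not part of the formal development below): take $g = P^1_1$ and $h = S\circ P^3_2$ in the primitive recursion schema, giving

```latex
\begin{align*}
  \mathrm{add}(0, x)   &= P^1_1(x) = x\\
  \mathrm{add}(y+1, x) &= S\bigl(P^3_2(y,\ \mathrm{add}(y,x),\ x)\bigr)
                        = \mathrm{add}(y,x) + 1.
\end{align*}
```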

### Formalising the set of PR functions

If we regard the constructions above as a programming language, it’s absolutely minimalist, even by comparison with the pure $\lambda$-calculus.
Every recursion must be bounded by a precomputed integer.
Worse, the arguments in the recursion are not allowed to vary, so the common functional programming technique of *accumulating arguments* is impossible.
Division, for example, is tricky to code.
It isn’t defined in the Wikipedia article and I can’t see a better algorithm than explicit inversion of multiplication, i.e.,
trying ever larger “quotients” and stopping before they get too large.
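That search strategy can be sketched in ordinary code (Python here, purely for illustration of the idea; this is not an actual encoding into the PR combinators):

```python
def divide(x, y):
    """Quotient x div y for y > 0, by inverting multiplication:
    try ever larger quotients q, stopping before (q+1)*y gets too large.
    The number of rounds is bounded in advance by x itself, as
    primitive recursion demands."""
    q = 0
    for _ in range(x):          # bound fixed before the loop starts
        if (q + 1) * y <= x:    # the next quotient is still small enough
            q = q + 1
    return q
```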

But we are not defining a language at all but rather a *predicate* identifying those functions
in $\bigcup_{k\ge0}\,\mathbb{N}^k\to\mathbb{N}$ that are primitive recursive.

Our first decision is how to formalise the tuples of arguments $(x_1,\ldots,x_k)$. Szasz, using ALF (and apparently just adding the desired rules to Martin-Löf type theory) defines a set $T(k)$ of $k$-tuples of natural numbers. Such a dependently-typed option isn’t available in Isabelle/HOL, and anyway, it seems simpler to use lists.

#### The base cases

Following Szasz, let’s define a version of the built-in function `hd` that returns zero for the empty list and otherwise returns the first element:

    primrec hd0 :: "nat list ⇒ nat" where
      "hd0 [] = 0"
    | "hd0 (m # ms) = m"

Now for the successor function (which returns 1 for the empty list):

    definition SC :: "nat list ⇒ nat"
      where "SC l = Suc (hd0 l)"

We have the functions `CONSTANT n` for each `n`, by currying:

    definition CONSTANT :: "nat ⇒ nat list ⇒ nat"
      where "CONSTANT n l = n"

Projection is expressed with the help of `drop`, a function that removes a given number of elements from the front of a list:

    definition PROJ :: "nat ⇒ nat list ⇒ nat"
      where "PROJ i l = hd0 (drop i l)"

Now for a confession: here, tuples are indexed from zero and should properly be written $(x_0,\ldots,x_{k-1})$.
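For comparison, the three base functions can be mirrored in Python (an informal sketch of the HOL definitions above, not generated from them):

```python
def hd0(l):
    # head of a list, defaulting to 0 on the empty list
    return l[0] if l else 0

def SC(l):
    # successor of the first argument; returns 1 on the empty list
    return hd0(l) + 1

def CONSTANT(n):
    # the constant function for n, curried as in the HOL definition
    return lambda l: n

def PROJ(i):
    # projection onto component i (indexed from zero), via list drop
    return lambda l: hd0(l[i:])
```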

#### Operations to combine PR functions

Szasz defined the dependent type $TPR(n,m)$ to represent (the codes of) functions from $n$-tuples to $m$-tuples, i.e. functions $\mathbb{N}^n\to\mathbb{N}^m$.
By working with such tupled functions, Szasz can specify function composition
as combining elements of $TPR(k,m)$ and $TPR(n,k)$ to yield $TPR(n,m)$.
The Isabelle/HOL equivalent composes a function `g` with a **list** `fs` of functions. Given an argument list `l`, each element `f` of the list `fs` is applied to `l` and the results are presented to `g`. I would argue that this approach is simpler.

    definition COMP :: "[nat list ⇒ nat, (nat list ⇒ nat) list, nat list] ⇒ nat"
      where "COMP g fs l = g (map (λf. f l) fs)"
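In the same informal Python mirror, composition is a one-liner:

```python
def COMP(g, fs):
    # apply every f in fs to the argument list l, then feed the results to g
    return lambda l: g([f(l) for f in fs])
```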

Primitive recursion itself is delegated to `rec_nat`, a low-level function for recursion over the natural numbers. Provided the argument list is nonempty, its tail (namely `l`) represents the tuple $(x_1,\ldots,x_k)$, while `y` ranges over the predecessors of `x` and `r` represents the inner recursive call. The function `g` is applied to a list whose first two elements are `r` and `y`. And the first line of the definition below handles the degenerate case, because primitive recursion is actually undefined for the empty list.

    fun PREC :: "[nat list ⇒ nat, nat list ⇒ nat, nat list] ⇒ nat" where
      "PREC f g [] = 0"
    | "PREC f g (x # l) = rec_nat (f l) (λy r. g (r # y # l)) x"
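Again as an informal Python sketch (mirroring `rec_nat`: start from `f l` and fold `g` over the predecessors of `x`), with addition as the classic example:

```python
def PREC(f, g):
    def run(l):
        if not l:                 # degenerate case: empty argument list
            return 0
        x, rest = l[0], l[1:]
        r = f(rest)               # base case, as in rec_nat
        for y in range(x):        # y ranges over the predecessors of x
            r = g([r, y] + rest)  # g sees the recursive result r, then y
        return r
    return run

# addition: add(0, n) = n and add(x+1, n) = add(x, n) + 1
add = PREC(lambda l: l[0] if l else 0,  # f: return the remaining argument
           lambda l: l[0] + 1)          # g: successor of the recursive result
```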

#### Special cases of evaluation

The following five claims were present in the development but are not actually used. They simply present special cases of the definitions above in a more legible form, especially in the case of composition. The last two show the operation of primitive recursion.

    lemma SC: "SC (x # l) = Suc x"
      by (simp add: SC_def)

    lemma PROJ_0: "PROJ 0 (x # l) = x"
      by (simp add: PROJ_def)

    lemma COMP_1: "COMP g [f] l = g [f l]"
      by (simp add: COMP_def)

    lemma PREC_0: "PREC f g (0 # l) = f l"
      by simp

    lemma PREC_Suc: "PREC f g (Suc x # l) = g (PREC f g (x # l) # x # l)"
      by auto

#### The actual inductive definition

Having defined all the basic functions and operations to combine them, the inductive definition itself is trivial. Several of these functions are simple enough that they could have been written inline, but it’s convenient to have names such as `SC` available separately. It’s notable that Szasz did in fact formalise a language of PR functions, which she interpreted in a separate step.

    inductive PRIMREC :: "(nat list ⇒ nat) ⇒ bool" where
      SC: "PRIMREC SC"
    | CONSTANT: "PRIMREC (CONSTANT k)"
    | PROJ: "PRIMREC (PROJ i)"
    | COMP: "PRIMREC g ⟹ ∀f ∈ set fs. PRIMREC f ⟹ PRIMREC (COMP g fs)"
    | PREC: "PRIMREC f ⟹ PRIMREC g ⟹ PRIMREC (PREC f g)"

This declaration causes Isabelle to define the named predicate, expressing the recursion through a fixed point construction. The given closure properties, regarded as introduction rules, are derived automatically. The corresponding induction principle, an elimination rule, is also proved. A basic introduction and examples are available in the documentation.

For those of you interested in an abstract and theoretical development,
Peter Aczel’s chapter in the *Handbook of Mathematical Logic* is the ultimate account.

### Now for the inductive proof itself

To prove that Ackermann’s function is not primitive recursive, we show that for each PR function $f$ we can find some bound $k$, in the sense that $A(k,{-})$ grows strictly faster than $f$. To build up to this result, we work through all the ways of constructing a PR function. It’s only a matter of style that we prove these cases as separate lemmas rather than as one big induction.

#### The base cases

For the successor function, the desired $k$ is simply 1. Incidentally, the function `sum_list` denotes the sum of the elements of a list.

    lemma SC_case: "SC l < ack 1 (sum_list l)"
      unfolding SC_def
      by (induct l) (simp_all add: le_add1 le_imp_less_Suc)
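A quick numeric sanity check of this bound (Python, with a naive `ack` that is only feasible for tiny arguments; this merely spot-checks the inequality, it proves nothing):

```python
def ack(m, n):
    # Ackermann's function, naive recursion (fine for very small arguments)
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))

# SC l < ack 1 (sum l) on a few small argument lists,
# where SC l is the successor of the head (1 on the empty list)
for l in [[], [0], [3], [2, 5], [1, 1, 1]]:
    sc_l = (l[0] if l else 0) + 1
    assert sc_l < ack(1, sum(l))
```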

For the constant function for $n$, the desired $k$ is $n$.

    lemma CONSTANT_case: "CONSTANT n l < ack n (sum_list l)"
      by (simp add: CONSTANT_def)

For any projection function, the desired $k$ is actually zero!

    lemma PROJ_case: "PROJ i l < ack 0 (sum_list l)"
    proof -
      have "hd0 (drop i l) ≤ sum_list l"
        by (induct l arbitrary: i) (auto simp: drop_Cons' trans_le_add2)
      then show ?thesis
        by (simp add: PROJ_def)
    qed

#### The COMP case

The case for function composition is complicated because the function $g$ is composed with a list of functions $f$. So we begin with a lemma that finds a single bound $k$ covering all the functions in the list, assuming the existence of separate bounds for each individual function. It’s a simple induction over the list. To combine the bounds, we use the lemma `ack_add_bound`, proved in the previous post.

    lemma COMP_map_aux:
      "∀f ∈ set fs. ∃kf. ∀l. f l < ack kf (sum_list l)
       ⟹ ∃k. ∀l. sum_list (map (λf. f l) fs) < ack k (sum_list l)"
    proof (induct fs)
      case Nil
      then show ?case by auto
    next
      case (Cons a fs)
      then show ?case
        by simp (blast intro: add_less_mono ack_add_bound less_trans)
    qed

For function composition itself, the assumptions refer to the existence of bounds for $g$ and for each $f$. We use the lemma above in conjunction with the inequality `ack_nest_bound`.

    lemma COMP_case:
      assumes 1: "∀l. g l < ack kg (sum_list l)"
        and 2: "∀f ∈ set fs. ∃kf. ∀l. f l < ack kf (sum_list l)"
      shows "∃k. ∀l. COMP g fs l < ack k (sum_list l)"
      unfolding COMP_def
      using 1 COMP_map_aux [OF 2]
      by (meson ack_less_mono2 ack_nest_bound less_trans)

#### The PREC case

Primitive recursion itself has the most complicated proof. We assume a nonempty argument list (the other case is degenerate) and use induction on the actual integer on which the recursion is done. The unusual form of the induction statement (adding `sum_list l` on the left) allows the relatively simple proof below.

    lemma PREC_case_aux:
      assumes f: "⋀l. f l + sum_list l < ack kf (sum_list l)"
        and g: "⋀l. g l + sum_list l < ack kg (sum_list l)"
      shows "PREC f g (m#l) + sum_list (m#l) < ack (Suc (kf + kg)) (sum_list (m#l))"
    proof (induct m)
      case 0
      then show ?case
        using ack_less_mono1_aux f less_trans by fastforce
    next
      case (Suc m)
      let ?r = "PREC f g (m#l)"
      have "¬ g (?r # m # l) + sum_list (?r # m # l) < g (?r # m # l) + (m + sum_list l)"
        by force
      then have "g (?r # m # l) + (m + sum_list l) < ack kg (sum_list (?r # m # l))"
        by (meson g leI less_le_trans)
      moreover have "… < ack (kf + kg) (ack (Suc (kf + kg)) (m + sum_list l))"
        using Suc.hyps by simp (meson ack_le_mono1 ack_less_mono2 le_add2 le_less_trans)
      ultimately show ?case
        by auto
    qed

The previous result is generalised to cover the degenerate case of an empty argument list.

    lemma PREC_case_aux':
      assumes f: "⋀l. f l + sum_list l < ack kf (sum_list l)"
        and g: "⋀l. g l + sum_list l < ack kg (sum_list l)"
      shows "PREC f g l + sum_list l < ack (Suc (kf + kg)) (sum_list l)"
      by (smt (verit, best) PREC.elims PREC_case_aux add.commute add.right_neutral f g less_ack2)

To obtain the PR case in the form we need for the main induction, other inequalities involving Ackermann’s function are brought in.

    proposition PREC_case:
      "⟦⋀l. f l < ack kf (sum_list l); ⋀l. g l < ack kg (sum_list l)⟧
       ⟹ ∃k. ∀l. PREC f g l < ack k (sum_list l)"
      by (metis le_less_trans [OF le_add1 PREC_case_aux'] ack_add_bound2)

### The main result

The result is by induction on the construction of a given primitive recursive function $f$, but the machine proof is now trivial because we have proved all the cases of the induction above.

    lemma ack_bounds_PRIMREC: "PRIMREC f ⟹ ∃k. ∀l. f l < ack k (sum_list l)"
      by (erule PRIMREC.induct)
         (blast intro: SC_case CONSTANT_case PROJ_case COMP_case PREC_case)+

The actual function that is not primitive recursive is Ackermann’s function along its own diagonal.

    theorem ack_not_PRIMREC: "¬ PRIMREC (λl. ack (hd0 l) (hd0 l))"
    proof
      assume *: "PRIMREC (λl. ack (hd0 l) (hd0 l))"
      then obtain m where m: "⋀l. ack (hd0 l) (hd0 l) < ack m (sum_list l)"
        using ack_bounds_PRIMREC by blast
      show False
        using m [of "[m]"] by simp
    qed

The full development can be found in Isabelle’s Archive of Formal Proofs. I tidied it considerably in the course of writing this blog post. The new version will remain hidden in the development branch of the AFP until Isabelle 2022 is released, around October. Although the theorem is remarkable and deep, its formal proof is concise: a mere 300 lines.

By the way, if you are looking for a function that is not primitive recursive and has a practical application, the answer is: any programming language interpreter. An interpreter takes a source program (encoded somehow) and executes it, so it can easily run forever, whereas PR functions necessarily terminate. Even an interpreter for (a programming language of) PR functions, which will always terminate because PR functions do, cannot itself be PR, by the obvious diagonalisation argument.