Formalising Gödel's incompleteness theorems, I

18 May 2022

[ incompleteness nominal package Archive of Formal Proofs Kurt Gödel ]

Gödel’s incompleteness theorems state limits on formal systems. (1) A consistent system strong enough to express the basic properties of integer addition and multiplication must be incomplete: there exists a formula that is neither provable nor refutable within the system, and (2) no such formal system can prove its own consistency. The first theorem is proved by using integer arithmetic to encode logical formulas, operations on them such as substitution, and inference according to the rules of the formal system. A fixedpoint construction yields an explicit formula expressing its own unprovability. The technical complications of the first theorem are formidable but were overcome already by Shankar in the 1980s and again by John Harrison and Russell O’Connor. This post introduces my own formalisation, using Isabelle/HOL. It also demonstrates formalising syntax involving variable binding using the nominal package of Christian Urban and Stefan Berghofer. More generally, it illustrates how to specify the syntax, semantics and proof theory of a formal system.

Remarks about the informal proof

One difficulty with formalising incompleteness is purely technical: much of the reasoning in the proof is straightforward mathematically, but has to be carried out within the given formal calculus. We’ve already seen how hard it is to prove obvious things in a theorem prover, despite all its automation; now imagine proving those things in a raw formal calculus, itself nothing more than a data structure formalised in a theorem prover. So here is a spoiler: such proofs were typically hundreds of lines long. I’ve written a second paper that comments extensively on the length of each component of the development.

My formalisation follows a development by Świerczkowski. He gave a no-handwaving informal proof, a gift for anyone who might come along later to formalise it. He wrote out many details glossed over in textbooks. He made strategic decisions to minimise the effort needed to reach even the second incompleteness theorem, which had been regarded by many as unattainable.

Świerczkowski chose to rely on the hereditarily finite sets rather than the integers as the basis for coding. Decoding $2^x3^y$ requires the fundamental theorem of arithmetic; an alternative coding option needs the Chinese remainder theorem and neither is tempting to formalise in an internalised first-order calculus. The set theoretic treatment of ordered pairs as $\{\{x\},\{x,y\}\}$ is infinitely simpler. He also proved a meta-theorem stating that every true Σ-formula is provable in the calculus with no need to write out the proofs. A Σ-formula can begin with any number of existential quantifiers, and they are sufficient to express much of the logic of coding. The standard approach yields a more powerful meta-theorem (where also false formulas have explicit disproofs), but it requires all quantifiers to be bounded, and ultimately requiring more work than just writing out some formal proofs.

The stages of the proofs of the first theorem are as follows:

Formalisation of the internal calculus, HF
Meta-theorem stating that every true Σ-sentence is provable
Defining a coding system for all HF terms and formulas
Defining predicates within HF itself to recognise terms, formulas and operations such as substitution; then inference rules and provability itself
Exhibiting the actual undecidable HF formula

On the treatment of bound variables

Formal reasoning about syntax including variable binding is generally fraught with difficulties connected with substitution and variable capture. In Isabelle/HOL we are lucky to have the nominal package, created by Christian Urban and based on theoretical work by Andrew Pitts and Jamie Gabbay. The nominal approach to variable binding provides a calculus of permutations on variable names, and provides a smooth treatment of syntactic operations that treat bound variables appropriately (which in particular means that all results are independent of which names are chosen for the bound variables). It precisely defines the notion of a variable being fresh and gives you a means of picking fresh variables. You get to assume that variables are magically renamed whenever necessary.

My formal development of the incompleteness theorems uses the nominal approach in formalising the logical calculus: its syntax, syntactic operations and inference rules. When it comes to coding formulas of the calculus, we need a different approach to variable binding, as attempting to formalise the nominal approach within the formal calculus itself is not to be imagined. Although Swierczkowski used plain variable names, I felt certain that a nameless representation would work better, and the obvious one is de Bruijn’s (explanation on Wikipedia).

The proof requires proving that the encoded operations carry out their intended effect. Happily, there’s a simple correspondence between syntax and operations formalised using the nominal approach and their counterparts using de Bruijn indices.

A formal logic and its Isabelle/HOL formalisation

Now let’s see a few highlights of the Isabelle formalisation of incompleteness. A bit of magic (omitted here) sets up the nominal package and creates the type name to serve as a type of variable names. The nominal package provide its own datatype declaration facility. We can now declare a type for the terms of our formalism. Terms can be variables, 0 or “eats” ($A \lhd x$ for the set whose elements are those of $A$, plus $x$).

nominal_datatype tm = Zero | Var name | Eats tm tm

The formulas provide a bare bones predicate calculus, able to express $x\in y$, $x=y$, $\phi\lor\psi$, $\neg \phi$ and $\exists x.\, \phi$. The phrase binds x in f indicates that the occurrence of x is binding.

nominal_datatype fm =
    Mem tm tm    (infixr "IN" 150)
  | Eq tm tm     (infixr "EQ" 150)
  | Disj fm fm   (infixr "OR" 130)
  | Neg fm
  | Ex x::name f::fm binds x in f

This is all we need to define the syntax of our first-order calculus. The next steps will define substitution (necessary to express the rules of inference) and the semantics.

Defining substitution

Substitution of a term for a variable in another term is trivial. It has no effect on 0; a variable is replaced by the new term if it matches; compound terms (involving $\lhd$) are substituted recursively. The last line proves that the function uses names legitimately. It is the last such proof we are going to see: they become gruesome, which is the only real drawback of the nominal package.

nominal_function subst :: "name ⇒ tm ⇒ tm ⇒ tm"
  where
   "subst i x Zero       = Zero"
 | "subst i x (Var k)    = (if i=k then x else Var k)"
 | "subst i x (Eats t u) = Eats (subst i x t) (subst i x u)"
by (auto simp: eqvt_def subst_graph_aux_def) (metis tm.strong_exhaust)

Substitution over formulas is also pretty straightforward. In most cases it is simply built up recursively. The line for the existential quantifier says in effect, ensure that the quantified variable is fresh with respect to the variable and term of the substitution, then simply substitute recursively. We can read this as an order to rename the bound variable appropriately to prevent a name clash, and we don’t actually care what name is chosen.

nominal_function  subst_fm :: "fm ⇒ name ⇒ tm ⇒ fm" ("_'(_::=_')" [1000, 0, 0] 200)
  where
    Mem:  "(Mem t u)(i::=x)  = Mem (subst i x t) (subst i x u)"
  | Eq:   "(Eq t u)(i::=x)   = Eq  (subst i x t) (subst i x u)"
  | Disj: "(Disj A B)(i::=x) = Disj (A(i::=x)) (B(i::=x))"
  | Neg:  "(Neg A)(i::=x)    = Neg  (A(i::=x))"
  | Ex:   "atom j ♯ (i, x) ⟹ (Ex j A)(i::=x) = Ex j (A(i::=x))"

The condition atom j ♯ (i, x) in the last line above requires an explanation. Crucial to the nominal approach is the idea of a variable being fresh for a given expression, which roughly means that it is not free in that expression: $a\mathbin{\sharp} E$ means “$a$ is fresh for $E$”. The condition in the last line is that j, the quantified variable, must be fresh for both i, the variable being substituted, and x, the replacement term.

Now let’s prove that if a variable is fresh for a term, then substitution has no effect. Its proof is one line: “induction then simplify”. The analogous theorem for formulas has an equally simple proof, and in particular, it tells us that (Ex i A)(i::=x) = Ex i A.

lemma forget_subst_tm [simp]: "atom a ♯ tm ⟹ subst a x tm = tm"
  by (induct tm rule: tm.induct) (simp_all add: fresh_at_base)

The following little result states that two successive substitutions within a formula are equivalent to a single substitution on the formula, the other substitution taking place in the term. The proof is another one-line induction, where the “avoiding” clause informs the nominal package of the syntactic entities that quantified bound variables must avoid.

lemma subst_fm_commute[simp]:
  "atom j ♯ A ⟹ (A(i::=t))(j::=u) = A(i ::= subst j u t)"
  by (nominal_induct A avoiding: i j t u rule: fm.strong_induct) (auto simp: fresh_at_base)

This sort of proof can be absolutely fiendish with other approaches to variable binding. Nominal requires some effort to justify a function definition, but in return it makes reasoning about the function really easy.

Formal semantics of the calculus

The formal semantics is defined in terms of the existing development of hereditarily finite sets. Variables are interpreted with respect to an environment, a finite function mapping names to hf sets. The corresponding AFP entry is among the most heavily used in the entire Archive.

As before, the definition for terms has a trivial justification (omitted anyway). The semantics of a term map the HF constructors (Zero and Eats) to the corresponding operators, while the meaning of a variable is looked up in the environment.

nominal_function eval_tm :: "(name, hf) finfun ⇒ tm ⇒ hf"
  where
   "eval_tm e Zero = 0"
 | "eval_tm e (Var k) = finfun_apply e k"
 | "eval_tm e (Eats t u) = eval_tm e t ◃ eval_tm e u"

An omitted bit of magic allows us to write the semantics of a term as ⟦t⟧e instead of eval_tm e t. Now for formulas: given an environment, the semantics of the formula of our calculus is a Boolean. It is a standard Tarski truth definition, in effect an embedding of our calculus into higher-order logic.

nominal_function eval_fm :: "(name, hf) finfun ⇒ fm ⇒ bool"
  where
   "eval_fm e (t IN u) ⟷ ⟦t⟧e ❙∈ ⟦u⟧e"
 | "eval_fm e (t EQ u) ⟷ ⟦t⟧e = ⟦u⟧e"
 | "eval_fm e (A OR B) ⟷ eval_fm e A ∨ eval_fm e B"
 | "eval_fm e (Neg A) ⟷ (~ eval_fm e A)"
 | "atom k ♯ e ⟹ eval_fm e (Ex k A) ⟷ (∃x. eval_fm (finfun_update e k x) A)"

Omitted once again is an ugly proof that the function definition is legitimate. The only difficult case refers to the last line above, which the semantics of a quantified formula provided the bound variable is fresh in the environment. It is straightforward to prove that the last line in fact holds unconditionally.

Now for some proofs. And once again, proofs about the functions just defined are simple. This one says that the semantics of a term t is unaffected if we update the environment at a variable that is fresh for t.

lemma forget_eval_tm [simp]: "atom i ♯ t ⟹  ⟦t⟧(finfun_update e i x) = ⟦t⟧e"
  by (induct t rule: tm.induct) (simp_all add: fresh_at_base)

This lemma is the analogous result for formulas. The proof is once again “induction, then simplify”.

lemma forget_eval_fm [simp]:
   "atom k ♯ A ⟹ eval_fm (finfun_update e k x) A = eval_fm e A"
  by (nominal_induct A avoiding: k e rule: fm.strong_induct)
     (simp_all add: fresh_at_base finfun_update_twist)

The following two lemmas relate syntax with semantics: the effect of syntactic substitution is identical to that of updating the environment with the semantics of the substituted term.

lemma eval_subst_tm</span>: "⟦subst i t u⟧e = ⟦u⟧(finfun_update e i ⟦t⟧e)"
  by (induct u rule: tm.induct) (auto)

And the same thing for formulas.

lemma eval_subst_fm: "eval_fm e (fm(i::= t)) = eval_fm (finfun_update e i ⟦t⟧e) fm"
  by (nominal_induct fm avoiding: i t e rule: fm.strong_induct)
     (simp_all add: eval_subst_tm finfun_update_twist fresh_at_base)

Nobody should imagine that such simple proofs could be possible in any approach based on naïve variable names.

The inference system

HF, the internal calculus, is defined by a Hilbert system. HF formulas have only disjunctions, negations and existential quantifiers, so the missing connectives such as conjunctions and universal quantifiers must be defined as the obvious abbreviations. For Boolean logic, the proof system incorporates the following fairly arbitrary set of axiom schemes. I defined it inductively for convenience, although there is no recursion.

inductive_set boolean_axioms :: "fm set"
  where
    Ident:     "A IMP A ∈ boolean_axioms"
  | DisjI1:    "A IMP (A OR B) ∈ boolean_axioms"
  | DisjCont:  "(A OR A) IMP A ∈ boolean_axioms"
  | DisjAssoc: "(A OR (B OR C)) IMP ((A OR B) OR C) ∈ boolean_axioms"
  | DisjConj:  "(C OR A) IMP (((Neg C) OR B) IMP (A OR B)) ∈ boolean_axioms"

The scheme of “special axioms” defines existential quantification. In standard notation, these axioms have the form $\phi(t)\to \exists x.\,\phi(x)$, where $t$ is any term.

inductive_set special_axioms :: "fm set" where
  I: "A(i::=x) IMP (Ex i A) ∈ special_axioms"

The induction axioms include every instance of the induction scheme for HF sets. In standard notation, these axioms have the form $\phi(0) \land \forall xy\, [\phi(x)\land\phi(y)\to\phi(x\lhd y)]\to \forall x\,\phi(x)$.

inductive_set induction_axioms :: "fm set" where
  ind:
  "atom (j::name) ♯ (i,A)
   ⟹ A(i::=Zero) IMP ((All i (All j (A IMP (A(i::= Var j) IMP A(i::= Eats(Var i)(Var j))))))
      IMP (All i A))
    ∈ induction_axioms"

Further axioms (omitted) describe the properties of the HF operators Zero and Eats; there are also some standard equality axioms. Finally, there’s an unspecified extra axiom, standing for any else finite statement we wish to assume. The extra axiom is required to be true in the semantics, and all the other axioms are proved to hold, so this proof calculus will be consistent by construction.

We are finally ready to behold the inference system itself. Because it is a Hilbert system, $\Gamma\vdash\phi$ means that $\phi$ follows from the given set of assumptions, $\Gamma$. The first line is the trivial inclusion and the next six lines state that the various axioms are theorems.

inductive hfthm :: "fm set ⇒ fm ⇒ bool" (infixl "⊢" 55)
  where
    Hyp:    "A ∈ H ⟹ H ⊢ A"
  | Extra:  "H ⊢ extra_axiom"
  | Bool:   "A ∈ boolean_axioms ⟹ H ⊢ A"
  | Eq:     "A ∈ equality_axioms ⟹ H ⊢ A"
  | Spec:   "A ∈ special_axioms ⟹ H ⊢ A"
  | HF:     "A ∈ HF_axioms ⟹ H ⊢ A"
  | Ind:    "A ∈ induction_axioms ⟹ H ⊢ A"
  | MP:     "H ⊢ A IMP B ⟹ H' ⊢ A ⟹ H ∪ H' ⊢ B"
  | Exists: "H ⊢ A IMP B ⟹ atom i ♯ B ⟹ ∀C ∈ H. atom i ♯ C ⟹ H ⊢ (Ex i A) IMP B"

The last two lines above are inference rules:

modus ponens: from $\phi\to\psi$ and $\phi$ infer $\psi$
so-called $\exists$-introduction: from $\phi\to\psi$ infer $(\exists x.\,\phi)\to\psi$ provided $x$ is fresh

It is now easy to prove that the internal calculus is sound. The proof is a straightforward induction, referring to (omitted) proofs that the axioms are true in the semantics. The calculus is therefore consistent. This development of incompleteness differs from many others, which typically assume a more abstract calculus with consistency requirements.

theorem hfthm_sound: assumes "H ⊢ A" shows "(∀B∈H. eval_fm e B) ⟹ eval_fm e A"
using assms
proof (induct arbitrary: e)
  case (Hyp A H) thus ?case
    by auto
next
  case (Extra H) thus ?case
    by (metis extra_axiom_holds)
next
  case (Bool A H) thus ?case
    by (metis boolean_axioms_hold)
next
  case (Eq A H) thus ?case
    by (metis equality_axioms_hold)
next
  case (Spec A H) thus ?case
    by (metis special_axioms_hold)
next
  case (HF A H) thus ?case
    by (metis HF_axioms_hold)
next
  case (Ind A H) thus ?case
    by (metis induction_axioms_hold)
next
  case (MP H A B H') thus ?case
    by auto
next
  case (Exists H A B i e) thus ?case
    by auto (metis forget_eval_fm)
qed

Proving the deduction theorem

We now have a sound Hilbert system, but it would be extremely inconvenient for conducting actual proofs, which we shall have to do. A substantial amount of largely routine work is necessary to derive from it a sort of sequent calculus, which will allow a little bit of automation and sane-looking, if lengthy, proofs.

The only nontrivial step in this derivation is proving the deduction theorem, which describes the relationship between assumptions and implication. Precisely, it says that any assumption can be made explicit as an implication. The full proof is given below (though referring to some omitted lemmas). It’s another perfectly straightforward induction. Even the quantifier case is simple enough.

lemma deduction_Diff: assumes "H ⊢ B" shows "H - {C} ⊢ C IMP B"
using assms
proof (induct)
  case (Hyp A H) thus ?case
    by (metis Bool Imp_triv_I boolean_axioms.Ident hfthm.Hyp member_remove remove_def)
next
  case (Extra H) thus ?case
    by (metis Imp_triv_I hfthm.Extra)
next
  case (Bool A H) thus ?case
    by (metis Imp_triv_I hfthm.Bool)
next
  case (Eq A H) thus ?case
    by (metis Imp_triv_I hfthm.Eq)
next
  case (Spec A H) thus ?case
    by (metis Imp_triv_I hfthm.Spec)
next
  case (HF A H) thus ?case
    by (metis Imp_triv_I hfthm.HF)
next
  case (Ind A H) thus ?case
    by (metis Imp_triv_I hfthm.Ind)
next
  case (MP H A B H')
  hence "(H-{C}) ∪ (H'-{C}) ⊢ Imp C B"
    by (simp add: S)
  thus ?case
    by (metis Un_Diff)
next
  case (Exists H A B i) show ?case
  proof (cases "C ∈ H")
    case True
    hence "atom i ♯ C" using Exists by auto
    moreover have "H - {C} ⊢ A IMP C IMP B" using Exists
      by (metis Imp_Imp_commute)
    ultimately have "H - {C} ⊢ (Ex i A) IMP C IMP B" using Exists
      by (metis fm.fresh(3) fm.fresh(4) hfthm.Exists member_remove remove_def)
    thus ?thesis
      by (metis Imp_Imp_commute)
  next
    case False
    hence "H - {C} = H" by auto
    thus ?thesis using Exists
      by (metis Imp_triv_I hfthm.Exists)
  qed
qed

That’s surely enough for now. More next week!

Machine Logic

At the junction of computation, logic and mathematics