Computing huge Fibonacci numbers, by proof and by code
[examples
newbies
Isabelle
Fibonacci
]
One of the key tensions in theorem proving is between reasoning about things and using them in serious computations. One way to address this tension is through computational reflection: treating logical formulas as executable code, when you can. The first realisation of this idea was the Boyer/Moore theorem prover, which was initially created to verify programs written in Pure lisp. Its assertion language was also pure Lisp, so all formulas were automatically executable. Subsequent proof assistants used much richer logics, but there always seemed to be an executable sublanguage. Then it is possible to perform computations within a proof much more efficiently than by rewriting alone. Here’s an example using Fibonacci numbers.
Evaluating Fibonacci numbers efficiently
We’ve now seen the Fibonacci numbers 0, 1, 1, 2, 3 … several times. Their traditional definition, with its two recursive calls, yields an exponential computation. That’s because, e.g., to get $F_9$ you need $F_7$ and $F_8$, and to get $F_8$ you end up computing $F_7$ a second time and so on all the way down.
\[\begin{align} F_0 &= 0\\ F_1 &= 1\\ F_{n+2} &= F_n + F_{n+1}, \end{align}\]But a linear computation is easily achieved: maintain two Fibonacci numbers at a time, since adding them yields the next one. We simply replace $(j,k)$ by $(k,j+k)$. If we start with $(F_0,F_1)$, we reach $F_n$ after $n$ iterations.
fun itfib :: "[int,int,int] ⇒ int" where "itfib n j k = (if n≤1 then k else itfib (n-1) k (j+k))"
And – just like that – we can calculate $F_{200}$ in Isabelle by rewriting alone, in under a second (0.7s):
lemma "itfib 200 0 1 = 280571172992510140037611932413038677189525" by simp
Just to be clear, Isabelle actually calculates that number itself
to verify the identity. It does so efficiently, thanks to a
symbolic binary representation of integers provided by the built-in
library. This point explains why itfib
is declared to operate on
integers (type int
) and not type nat
: the natural numbers
are built up from 0 by Suc
, the successor function,
which could easily force the entire computation to use unary notation.
You can’t go far representing numbers by a long string of Suc
s.
Proving the relationship with Fibonacci numbers
As we’ve seen before, the recursive definition above goes
directly into Isabelle. But this time, let’s define the Fibonacci function
to yield integers, so that we can compute large integers.
Its argument type is still nat
because that slightly simplifies
the definitions and proofs,
and the largest argument we’ll use is only 200.
fun fib :: "nat ⇒ int" where "fib 0 = 0" | "fib (Suc 0) = 1" | "fib (Suc (Suc n)) = fib n + fib (Suc n)"
It’s easy to prove the relationship between itfib
and fib
hinted at above. This proof isn’t difficult to find in Isabelle,
especially with the help of Sledgehammer.
It’s also written up in my book
ML for the Working Programmer.
lemma itfib_fib: "n > 0 ⟹ itfib n (fib k) (fib(k+1)) = fib (k+n)" proof (induction n arbitrary: k) case 0 then show ?case by auto next case (Suc n) have "n > 0 ⟹ itfib n (fib (k+1)) (fib k + fib (k+1)) = fib (k+n+1)" by (metis Suc.IH Suc_eq_plus1 add.commute add_Suc_right fib.simps(3)) with Suc show ?case by simp qed
(It’s on page 224. You should buy a copy, but you can also find the chapter online.)
Computing Fibonacci numbers efficiently
Higher-order logic contains a (rather ad-hoc) executable sublanguage. In particular, recursive function definitions such as those shown here are computable. The Isabelle/HOL code generator translates it into functional languages such as Standard ML and OCaml. The translation scheme maps like to like, so while it is “obviously correct”, it is not verified, and purists prefer to avoid it.
The code generation package provides a top-level command,
value, which returns the value
computed by an executable term; rewriting would return the
same value, but more slowly.
Similarly, eval
is a proof method that can replace simp
when the necessary simplification can be done by computation alone.
Let’s try value with Fibonacci numbers. Naive evaluation of $F_{44}$ takes a long time despite the use of efficient integers, because the recursion is exponential. (It returns 701408733 in about 10 seconds.)
value "fib 44"
What to do? First, delete the naive recursion rules for fib
from the code generator.
declare fib.simps [code del]
Second, provide a new way of computing fib
: a so-called
code equation (indicated by the [code]
attribute).
lemma fib_eq_itfib[code]: "fib n = (if n=0 then 0 else itfib (int n) 0 1)" using itfib_fib [of n 0] by simp
The new setup is much faster. It takes only 0.1 seconds to compute $F_{200} = 280571172992510140037611932413038677189525$.
value "fib 200"
The relevant Isabelle theory file is available here.