Formalising an easy proof: Dirichlet's approximation theorem
[examples
Isar
formalised mathematics
]
For many, the process of transforming a textbook mathematical proof into a formal document remains mysterious. Here, we look at a straightforward example. Dirichlet’s approximation theorem states that each real number has a rational approximation with a relatively small denominator. The proof combines an application of the pigeonhole principle with some simple calculations. Formalised, it’s only about 50 lines long. There are still a couple of tricky bits to deal with!
The textbook proof
Consider the problem of approximating π by a rational number. Any finite decimal representation of π is rational, but the denominators are large powers of 10 when we can do much better: 355/113 = 3.1415929 approximates π to seven significant figures (the true value is 3.14159265). Dirichlet’s approximation theorem says that given the real number $\theta$, for every integer $N>0$ there exist integers $h$ and $k$ with $0<k \le N$ such that $\vert k\theta-h\vert < 1/N$. Thus, $h/k$ is a good approximation to $\theta$.
Here is the proof, which comes from Modular Functions and Dirichlet Series in Number Theory by Tom M. Apostol, page 143.
Starting the formalisation
We begin with a precise formation of the theorem statement.
To avoid possible issues involving numeric types,
the types of θ and $N$ are given explicitly.
Some type coercions are also explicit: int
, which injects from type nat
to int
,
and of_int
, which injects from type int
into any numeric type that contains the integers,
in this case real
.
theorem Dirichlet_approx: fixes θ::real and N::nat assumes "N > 0" obtains h k where "0 < k" "k ≤ int N" "¦of_int k * θ - of_int h¦ < 1/N" proof -
Next, we define $X$ to be the set of $N+1$ real numbers mentioned at the start of the proof, while $Y$ denotes the division of the half-open unit interval into $N$ parts. Note that the image of a function is written in Isabelle/HOL using the ` operator, and the formal syntax for integer ranges is also helpful. Many beginners make the mistake of repeating large expressions rather than creating an abbreviation, especially when the source proof did not use an abbreviation. As we reason about $X$ and $Y$ below it should become clear that they make the formal text more readable. They make automation work better too.
define X where "X ≡ (λk. frac (k*θ)) ` {..N}" define Y where "Y ≡ (λk::nat. {k/N..< Suc k/N}) ` {..<N}"
Applying the pigeonhole principle
Next we need to prove that “some subinterval must contain at least two of these fractional parts”. Proof by contradiction is not mentioned, but I think is the easiest way to proceed.
have False if non: "¬ (∃b≤N. ∃a<b. ¦frac (real b * θ) - frac (real a * θ)¦ < 1/N)" proof -
The point is, it’s not immediately obvious that
“the $N+1$ real numbers $0$, $\{\theta\}$, $\{2\theta\},\ldots,\{N\theta\}$”
really are distinct. But the claim follows immediately from non
, the negated assumption.
We prove that the map $k\mapsto\{k\theta\}$ is injective,
and therefore that $X$ has the desired cardinality.
have "inj_on (λk::nat. frac (k*θ)) {..N}" using non by (intro linorder_inj_onI; force) then have caX: "card X = Suc N" by (simp add: X_def card_image)
It shouldn’t be difficult to prove that the cardinality of $Y$ is $N$, but for the pigeonhole argument we merely need $|Y|\le N$, which is trivial by definition.
have caY: "card Y ≤ N" "finite Y" unfolding Y_def using card_image_le by force+
To set up the pigeonhole argument, we define the function $f$. It is clearly a map from the half-open unit interval to $Y$, as can be seen from the result we immediately prove.
define f where "f ≡ λx::real. let k = nat ⌊x * N⌋ in {k/N ..< Suc k/N}" have "nat ⌊x * N⌋ < N" if "0 ≤ x" "x < 1" for x::real using that assms floor_less_iff nat_less_iff by fastforce
We actually show that $f$ maps $X$ to $Y$, and hence is not injective: that would imply $N+1\le N$.
then have "f ∈ X → Y" by (force simp: f_def Let_def X_def Y_def frac_lt_1) then have "¬ inj_on f X" using ‹finite Y› caX caY card_inj by fastforce
Therefore some pigeonhole holds two elements $x$, $x’$ of $X$, which we can write as $\{c\theta\}$, $\{d\theta\}$, for natural numbers $c$, $d$.
then obtain x x' where "x≠x'" "x ∈ X" "x' ∈ X" and eq: "f x = f x'" by (auto simp: inj_on_def) then obtain c d::nat where c: "c ≠ d" "c ≤ N" "d ≤ N" and xeq: "x = frac (c*θ)" and xeq': "x' = frac (d*θ)" by (metis (no_types, lifting) X_def atMost_iff image_iff)
To conclude the proof by contradiction, we have to exhibit the actual pigeonhole and do a few calculations. It turns out to be the half open interval $[\frac{i}{N},\frac{i+1}{N})$, where $i = \lfloor x N\rfloor$.
define i where "i ≡ nat ⌊x * N⌋" then have i: "x ∈ {i/N ..< Suc i/N}" using assms by (auto simp: divide_simps xeq) linarith have i': "x' ∈ {i/N ..< Suc i/N}" using eq assms unfolding f_def Let_def xeq' i_def by (simp add: divide_simps) linarith
A further small calculation shows that we have indeed contradicted the assumption
non
that we made above.
with assms i have "¦frac (d*θ) - frac (c*θ)¦ < 1 / N" by (simp add: xeq xeq' abs_if add_divide_distrib) with c show False by (metis abs_minus_commute nat_neq_iff non) qed
Finishing up
Thanks to the just-completed pigeonhole argument, we obtain natural numbers $a$ and $b$ satisfying claim (2) of the textbook proof and are ready to define the sought-for $k$ and $h$. Here ?k is a different sort of abbreviation, essentially a macro and appropriate in this simple case.
then obtain a b::nat where *: "a<b" "b≤N" "¦frac (b*θ) - frac (a*θ)¦ < 1/N" by metis let ?k = "b-a" and ?h = "⌊b*θ⌋ - ⌊a*θ⌋"
We are ready for the final calculation. Because the theorem was stated using obtains, the proof requires showing that our ?k and ?h satisfy the three inequalities given in the theorem statement. We only do the last one explicitly, mopping up the other two within the qed.
show ?thesis proof have "frac (b*θ) - frac (a*θ) = ?k*θ - ?h" using ‹a < b› by (simp add: frac_def left_diff_distrib of_nat_diff) with * show "¦of_int ?k * θ - ?h¦ < 1/N" by (metis of_int_of_nat_eq) qed (use * in auto) qed
There is no magic formula for coming up with such proofs. Do pay attention to the structure given in the text and try to reproduce it using the Isar structured proof language. The individual claims are not that complicated and in many places were proved by sledgehammer. Still, you need to master the isabelle/HOL syntax for formulas, including some of the more offbeat things shown here.
The relevant Isabelle theory file is available here. A generalisation of the proof to demonstrate the simultaneous approximation of finitely many real numbers is proved in the distribution.