## A classical proof: exponentials are irrational

16 Feb 2022

In Proofs from THE BOOK, Aigner and Ziegler present hundreds of classic proofs from what we might call the mathematical canon, based in large part on suggestions by Paul Erdős. The authors confine themselves to proofs requiring “only a modest amount of technique from undergraduate mathematics”. Nothing too advanced or specialised, but nevertheless, a selection of insightful techniques across the mathematical landscape. Here we look at an Isabelle/HOL proof that the exponential function yields irrational numbers.

### Getting started

Let’s be precise about that claim: $\exp r$ is irrational for all rational, nonzero $r$. Since $\exp(\ln x)=x$, it follows that $\ln r$ is irrational for every positive rational $r≠1$.

The authors present a simple 19th-century proof (alternative source) that relies on nothing other than differentiation and integration. I have recently formalised it for the Archive of Formal Proofs and we can look at some highlights here.

The proof begins by defining the function

\begin{align*} f(x) &= \frac{x^n (1-x)^n}{n!}, \end{align*}

which goes straightforwardly into Isabelle. (You shouldn’t reserve f for the name of a constant, though.) Note that $n$ above becomes a parameter of the function we define.

definition hf where "hf ≡ λn. λx::real. (x^n * (1-x)^n) / fact n"


The text notes that the coefficients of $f$ are integers, hinting that this fact is sufficient, although I could not see a way to carry the proof right to the end without calculating the coefficients exactly. So here they are.

definition cf where "cf ≡ λn i. if i < n then 0 else (n choose (i-n)) * (-1)^(i-n)"


### A few simple proofs

The proof that they are the right coefficients may be instructive. Sums are tedious to manipulate, especially when we perform a change of variables. (See below near sum.reindex, where we change the index range from $0\ldots n$ to $n\ldots 2n$. The authors did not give an explicit proof of this result (having not even stated what the coefficients were), but had they done so, they probably would have shown many of the steps that appear in the formal proof below.

lemma hf_int_poly:
fixes x::real
shows "hf n = (λx. (1 / fact n) * (∑i=0..2*n. real_of_int (cf n i) * x^i))"
proof
fix x
have inj: "inj_on ((+)n) {..n}"
by (auto simp: inj_on_def)
have [simp]: "((+)n)  {..n} = {n..2*n}"
have "(x^n * (-x + 1)^n) = x ^ n * (∑k≤n. real (n choose k) * (- x) ^ k)"
unfolding binomial_ring by simp
also have "... = x ^ n * (∑k≤n. real_of_int ((n choose k) * (-1)^k) * x ^ k)"
by (simp add: mult.assoc flip: power_minus)
also have "... = (∑k≤n. real_of_int ((n choose k) * (-1)^k) * x ^ (n+k))"
also have "... = (∑i=n..2*n. real_of_int (cf n i) * x^i)"
by (simp add: sum.reindex [OF inj, simplified] cf_def)
finally have "hf n x = (1 / fact n) * (∑i = n..2 * n. real_of_int (cf n i) * x^i)"
moreover have "(∑i = 0..<n. real_of_int (cf n i) * x^i) = 0"
ultimately show "hf n x = (1 / fact n) * (∑i = 0..2 * n. real_of_int (cf n i) * x^i)"
using sum.union_disjoint [of "{0..<n}" "{n..2*n}" "λi. real_of_int (cf n i) * x^i"]
by (simp add: ivl_disj_int_two(7) ivl_disj_un_two(7) mult_2)
qed


The text states the following fact about $f$ (for any given $n$) with strict inequalities. This non-strict version is easier to prove (one single line) and is all we need.

lemma
assumes "0 ≤ x" "x ≤ 1"
shows hf_nonneg: "0 ≤ hf n x" and hf_le_inverse_fact: "hf n x ≤ 1/fact n"
using assms by (auto simp: hf_def divide_simps mult_le_one power_le_one)


That $f$ is differentiable is proved by calculating and discarding its derivative. This task is trivial thanks to the trick of using built-in bundles of rules for derivatives, which we have seen before.

lemma hf_differt [iff]: "hf n differentiable at x"
unfolding hf_int_poly differentiable_def
by (intro derivative_eq_intros exI | simp)+


The following derivative calculation takes a dozen-plus lines to prove because the result that emerges automatically needs to have its index shifted by one. The (omitted) manipulations resemble those we’ve just seen.

lemma deriv_sum_int:
"deriv (λx. ∑i=0..n. of_int (c i) * x^i) x
= (if n=0 then 0 else (∑i=0..n - Suc 0. real_of_int ((int i + 1) * c (Suc i)) * x^i))"


### Reasoning about iterated derivatives by induction

We have been working towards proving a claim in the text: that the derivatives $f^{(k)} (0)$ and $f^{(k)} (1)$ are integers for all $k\ge0$. The text contains a four line argument only, but it seems essential to calculate the $k$-th derivatives exactly. The proof, somewhat cluttered with conversions from type nat to int to real, is by induction on $k$. We are dealing with products of sets of integers.

lemma hf_deriv_int_poly:
"(deriv^^k) (hf n) = (λx. (1/fact n)*(∑i=0..2*n-k. of_int(int(∏{i<..i+k}) * cf n (i+k)) * x^i))"
proof (induction k)
case 0
show ?case
next
case (Suc k)
define F where "F ≡ λx. (∑i = 0..2*n - k. real_of_int (int(∏{i<..i+k}) * cf n (i+k)) * x^i)"
have Fd: "F field_differentiable at x" for x
unfolding field_differentiable_def F_def
by (rule derivative_eq_intros exI | force)+
have [simp]: "prod int {i<..Suc (i + k)} = (1 + int i) * prod int {Suc i<..Suc (i + k)}" for i
have "deriv (λx. F x / fact n) x
= (∑i = 0..2 * n - Suc k. of_int (int(∏{i<..i+ Suc k}) * cf n (Suc (i+k))) * x^i) / fact n" for x
unfolding deriv_cdivide_right [OF Fd]
by (fastforce simp add: F_def deriv_sum_int cf_def simp flip: of_int_mult intro: sum.cong)
then show ?case
qed


So the derivative $f^{(k)} (0)$ is indeed an integer.

lemma hf_deriv_0: "(deriv^^k) (hf n) 0 ∈ ℤ"
proof (cases "n ≤ k")
case True
then obtain j where "(fact k::real) = real_of_int j * fact n"
by (metis fact_dvd dvd_def mult.commute of_int_fact of_int_mult)
moreover have "prod real {0<..k} = fact k"
ultimately show ?thesis
next
case False
then show ?thesis
qed


To prove that $f^{(k)} (1)$ is also an integer, we have to work harder. First we prove that $f’(x) = - {f’(1-x)}$. The proof is annoyingly long because to write the chain rule you need to make the function composition explicit.

lemma deriv_hf_minus: "deriv (hf n) = (λx. - deriv (hf n) (1-x))"
proof
fix x
have "hf n = hf n ∘ (λx. (1-x))"
by (simp add: fun_eq_iff hf_def mult.commute)
then have "deriv (hf n) x = deriv (hf n ∘ (λx. (1-x))) x"
by fastforce
also have "... = deriv (hf n) (1-x) * deriv ((-) 1) x"
by (intro real_derivative_chain) auto
finally show "deriv (hf n) x = - deriv (hf n) (1-x)"
by simp
qed


Then we need a quick proof that all the necessary derivatives exist.

lemma deriv_n_hf_diffr [iff]: "(deriv^^k) (hf n) field_differentiable at x"
unfolding field_differentiable_def hf_deriv_int_poly
by (rule derivative_eq_intros exI | force)+


Then, we prove another claim about $k$-th derivatives: that $f^{(k)}(x) = (−1)^k f^{(k)}(1−x)$. It is an easy induction.

lemma deriv_n_hf_minus: "(deriv^^k) (hf n) = (λx. (-1)^k * (deriv^^k) (hf n) (1-x))"
proof (induction k)
case 0
then show ?case
next
case (Suc k)
have o: "(λx. (deriv ^^ k) (hf n) (1-x)) = (deriv ^^ k) (hf n) ∘ (-) 1"
by auto
show ?case
proof
fix x
have [simp]: "((deriv^^k) (hf n) ∘ (-) 1) field_differentiable at x"
by (force intro: field_differentiable_compose)
have "(deriv ^^ Suc k) (hf n) x = deriv (λx. (-1) ^ k * (deriv ^^ k) (hf n) (1-x)) x"
by simp (metis Suc)
also have "... = (-1) ^ k * deriv (λx. (deriv ^^ k) (hf n) (1-x)) x"
using o by fastforce
also have "... = (-1) ^ Suc k * (deriv ^^ Suc k) (hf n) (1-x)"
by (subst o, subst deriv_chain, auto)
finally show "(deriv ^^ Suc k) (hf n) x = (-1) ^ Suc k * (deriv ^^ Suc k) (hf n) (1-x)" .
qed
qed


Finally we are able to show that the derivatives $f^{(k)} (1)$ are all integers. Moreover, for $k>2n$, the derivatives vanish.

lemma hf_deriv_1: "(deriv^^k) (hf n) 1 ∈ ℤ"
by (smt (verit, best) Ints_1 Ints_minus Ints_mult Ints_power deriv_n_hf_minus hf_deriv_0)

lemma hf_deriv_eq_0: "k > 2*n ⟹ (deriv^^k) (hf n) = (λx. 0)"
by (force simp add: cf_def hf_deriv_int_poly)


### Time for the big one

The main argument lies in the proof that all integer exponentials are irrational. We assume that $\exp s$ has the form $a/b$ where $s$, $a$, $b>0$. We define $n$ as the larger of $a^2$ and $3s^3$, from which we eventually prove that $n! > as^{2n+1}$. We define

\begin{align*} F(x) = \sum_{i\le 2n} (-1)^i s^{2n-i} f^{(i)} (x). \end{align*}

and then show that

\begin{align*} F'(x) = -{sF(x)} + s^{2n+1} f(x). \end{align*}

We put $N = aF(1) − bF(0)$, show $N$ to be an integer, and eventually show by bounding an integral that $0<N<1$, contradiction. A few subproofs of a dozen or so lines are omitted below. The full development is available online.

lemma exp_nat_irrational:
assumes "s > 0" shows "exp (real_of_int s) ∉ ℚ"
proof
assume "exp (real_of_int s) ∈ ℚ"
then obtain a b where ab: "a > 0" "b > 0" "coprime a b" and exp_s: "exp s = of_int a / of_int b"
using Rats_cases' div_0 exp_not_eq_zero of_int_0
by (smt (verit, best) exp_gt_zero of_int_0_less_iff zero_less_divide_iff)
define n where "n ≡ nat (max (a^2) (3 * s^3))"
then have ns3: "s^3 ≤ real n / 3"
by linarith
have "n > 0"
using ‹a > 0› n_def by (smt (verit, best) zero_less_nat_eq zero_less_power)
have s_le: "s ^ (2*n+1) ≤ (n / exp 1) ^ n"
omitted
have a_less: "a < sqrt (2*pi*n)"
omitted
have "sqrt (2*pi*n) * (n / exp 1) ^ n > a * s ^ (2*n+1)"
using mult_strict_right_mono [OF a_less] mult_left_mono [OF s_le]
by (smt (verit, best) s_le ab(1) assms of_int_1 of_int_le_iff of_int_mult zero_less_power)
then have n: "fact n > a * s ^ (2*n+1)"
using fact_bounds(1) by (smt (verit, best) ‹0 < n› of_int_fact of_int_less_iff)
define F where "F ≡ λx. ∑i≤2*n. (-1)^i * s^(2*n-i) * (deriv^^i) (hf n) x"
have Fder [derivative_intros]: "(F has_real_derivative -s * F x + s ^ (2*n+1) * hf n x) (at x)" for x
omitted
have F01_Ints: "F 0 ∈ ℤ" "F 1 ∈ ℤ"
by (simp_all add: F_def hf_deriv_0 hf_deriv_1 Ints_sum)
define sF where "sF ≡ λx. exp (of_int s * x) * F x"
define sF' where "sF' ≡ λx. of_int s ^ Suc(2*n) * (exp (of_int s * x) * hf n x)"
have sF_der: "(sF has_real_derivative sF' x) (at x)" for x
unfolding sF_def sF'_def
by (rule refl derivative_eq_intros | force simp: algebra_simps)+
let ?N = "b * integral {0..1} sF'"
have sF'_integral: "(sF' has_integral sF 1 - sF 0) {0..1}"
by (smt (verit) fundamental_theorem_of_calculus has_vector_derivative_at_within
has_field_derivative_iff_has_vector_derivative sF_der)
then have "?N = a * F 1 - b * F 0"
using ‹b > 0› by (simp add: integral_unique exp_s sF_def algebra_simps)
also have "... ∈ ℤ"
using hf_deriv_1 by (simp add: F01_Ints)
finally have N_Ints: "?N ∈ ℤ" .
have "sF' (1/2) > 0" and ge0: "⋀x. x ∈ {0..1} ⟹ 0 ≤ sF' x"
using assms by (auto simp add: sF'_def hf_def)
moreover have "continuous_on {0..1} sF'"
unfolding sF'_def hf_def by (intro continuous_intros) auto
ultimately have False if "(sF' has_integral 0) {0..1}"
using has_integral_0_cbox_imp_0 [of 0 1 sF' "1/2"] that by auto
then have "integral {0..1} sF' > 0"
by (metis ge0 has_integral_nonneg integral_unique order_le_less sF'_integral)
then have "0 < ?N"
by (simp add: ‹b > 0›)
have "integral {0..1} sF' = of_int s ^ Suc(2*n) * integral {0..1} (λx. exp (s*x) * hf n x)"
unfolding sF'_def by force
also have "... ≤ of_int s ^ Suc(2*n) * (exp s * (1 / fact n))"
omitted
finally have "?N ≤ b * of_int s ^ Suc(2*n) * exp s * (1 / fact n)"
using ‹b > 0› by (simp add: sF'_def mult_ac divide_simps)
also have "... < 1"
using n apply (simp add: field_simps exp_s)
by (metis of_int_fact of_int_less_iff of_int_mult of_int_power)
finally show False
using ‹0 < ?N› Ints_cases N_Ints by force
qed


The full result, for rational arguments, should be clear from the Isabelle text alone.

theorem exp_irrational:
fixes q::real assumes "q ∈ ℚ" "q ≠ 0" shows "exp q ∉ ℚ"
proof
assume q: "exp q ∈ ℚ"
obtain s t where "s ≠ 0" "t > 0" "q = of_int s / of_int t"
by (metis Rats_cases' assms div_0 of_int_0)
then have "(exp q) ^ (nat t) = exp s"
by (smt (verit, best) exp_divide_power_eq of_nat_nat zero_less_nat_eq)
moreover have "exp q ^ (nat t) ∈ ℚ"
`