A classical proof: exponentials are irrational

16 Feb 2022

In Proofs from THE BOOK, Aigner and Ziegler present hundreds of classic proofs from what we might call the mathematical canon, based in large part on suggestions by Paul Erdős. The authors confine themselves to proofs requiring “only a modest amount of technique from undergraduate mathematics”. Nothing too advanced or specialised, but nevertheless, a selection of insightful techniques across the mathematical landscape. Here we look at an Isabelle/HOL proof that the exponential function yields irrational numbers.

Getting started

Let’s be precise about that claim: $\exp r$ is irrational for all rational, nonzero $r$. Since $\exp(\ln x)=x$, it follows that $\ln r$ is irrational for every positive rational $r≠1$.

The authors present a simple 19th-century proof (alternative source) that relies on nothing other than differentiation and integration. I have recently formalised it for the Archive of Formal Proofs and we can look at some highlights here.

The proof begins by defining the function

\[\begin{align*} f(x) &= \frac{x^n (1-x)^n}{n!}, \end{align*}\]

which goes straightforwardly into Isabelle. (You shouldn’t reserve f for the name of a constant, though.) Note that $n$ above becomes a parameter of the function we define.

definition hf where "hf  λn. λx::real. (x^n * (1-x)^n) / fact n"

The text notes that the coefficients of $f$ are integers, hinting that this fact is sufficient, although I could not see a way to carry the proof right to the end without calculating the coefficients exactly. So here they are.

definition cf where "cf  λn i. if i < n then 0 else (n choose (i-n)) * (-1)^(i-n)"

A few simple proofs

The proof that they are the right coefficients may be instructive. Sums are tedious to manipulate, especially when we perform a change of variables. (See below near sum.reindex, where we change the index range from $0\ldots n$ to $n\ldots 2n$. The authors did not give an explicit proof of this result (having not even stated what the coefficients were), but had they done so, they probably would have shown many of the steps that appear in the formal proof below.

lemma hf_int_poly:
  fixes x::real
  shows "hf n = (λx. (1 / fact n) * (i=0..2*n. real_of_int (cf n i) * x^i))"
proof
  fix x
  have inj: "inj_on ((+)n) {..n}"
    by (auto simp: inj_on_def)
  have [simp]: "((+)n) ` {..n} = {n..2*n}"
    using nat_le_iff_add by fastforce
  have "(x^n * (-x + 1)^n) = x ^ n * (kn. real (n choose k) * (- x) ^ k)"
    unfolding binomial_ring by simp
  also have "... = x ^ n * (kn. real_of_int ((n choose k) * (-1)^k) * x ^ k)"
    by (simp add: mult.assoc flip: power_minus)
  also have "... = (kn. real_of_int ((n choose k) * (-1)^k) * x ^ (n+k))"
    by (simp add: sum_distrib_left mult_ac power_add)
  also have "... = (i=n..2*n. real_of_int (cf n i) * x^i)"
    by (simp add: sum.reindex [OF inj, simplified] cf_def)
  finally have "hf n x = (1 / fact n) * (i = n..2 * n. real_of_int (cf n i) * x^i)"
    by (simp add: hf_def)
  moreover have "(i = 0..<n. real_of_int (cf n i) * x^i) = 0"
    by (simp add: cf_def)
  ultimately show "hf n x = (1 / fact n) * (i = 0..2 * n. real_of_int (cf n i) * x^i)"
    using sum.union_disjoint [of "{0..<n}" "{n..2*n}" "λi. real_of_int (cf n i) * x^i"]
    by (simp add: ivl_disj_int_two(7) ivl_disj_un_two(7) mult_2)
qed

The text states the following fact about $f$ (for any given $n$) with strict inequalities. This non-strict version is easier to prove (one single line) and is all we need.

lemma
  assumes "0  x" "x  1"
  shows hf_nonneg: "0  hf n x" and hf_le_inverse_fact: "hf n x  1/fact n"
  using assms by (auto simp: hf_def divide_simps mult_le_one power_le_one)  

That $f$ is differentiable is proved by calculating and discarding its derivative. This task is trivial thanks to the trick of using built-in bundles of rules for derivatives, which we have seen before.

lemma hf_differt [iff]: "hf n differentiable at x"
  unfolding hf_int_poly differentiable_def
  by (intro derivative_eq_intros exI | simp)+

The following derivative calculation takes a dozen-plus lines to prove because the result that emerges automatically needs to have its index shifted by one. The (omitted) manipulations resemble those we’ve just seen.

lemma deriv_sum_int:
  "deriv (λx. i=0..n. of_int (c i) * x^i) x
     = (if n=0 then 0 else (i=0..n - Suc 0. real_of_int ((int i + 1) * c (Suc i)) * x^i))"

Reasoning about iterated derivatives by induction

We have been working towards proving a claim in the text: that the derivatives $f^{(k)} (0)$ and $f^{(k)} (1)$ are integers for all $k\ge0$. The text contains a four line argument only, but it seems essential to calculate the $k$-th derivatives exactly. The proof, somewhat cluttered with conversions from type nat to int to real, is by induction on $k$. We are dealing with products of sets of integers.

lemma hf_deriv_int_poly:
   "(deriv^^k) (hf n) = (λx. (1/fact n)*(∑i=0..2*n-k. of_int(int({i<..i+k}) * cf n (i+k)) * x^i))"
proof (induction k)
  case 0
  show ?case
    by (simp add: hf_int_poly)
next
  case (Suc k)
  define F where "F  λx. (i = 0..2*n - k. real_of_int (int({i<..i+k}) * cf n (i+k)) * x^i)"
  have Fd: "F field_differentiable at x" for x
    unfolding field_differentiable_def F_def
    by (rule derivative_eq_intros exI | force)+
  have [simp]: "prod int {i<..Suc (i + k)} = (1 + int i) * prod int {Suc i<..Suc (i + k)}" for i
    by (metis Suc_le_mono atLeastSucAtMost_greaterThanAtMost le_add1 of_nat_Suc prod.head)
  have "deriv (λx. F x / fact n) x
      = (i = 0..2 * n - Suc k. of_int (int({i<..i+ Suc k}) * cf n (Suc (i+k))) * x^i) / fact n" for x
    unfolding deriv_cdivide_right [OF Fd]
    by (fastforce simp add: F_def deriv_sum_int cf_def simp flip: of_int_mult intro: sum.cong)
  then show ?case
    by (simp add: Suc F_def)
qed

So the derivative $f^{(k)} (0)$ is indeed an integer.

lemma hf_deriv_0: "(deriv^^k) (hf n) 0  "
proof (cases "n  k")
  case True
  then obtain j where "(fact k::real) = real_of_int j * fact n"
    by (metis fact_dvd dvd_def mult.commute of_int_fact of_int_mult)
  moreover have "prod real {0<..k} = fact k"
    by (simp add: fact_prod atLeastSucAtMost_greaterThanAtMost)
  ultimately show ?thesis
    by (simp add: hf_deriv_int_poly dvd_def)
next
  case False
  then show ?thesis
    by (simp add: hf_deriv_int_poly cf_def)
qed

To prove that $f^{(k)} (1)$ is also an integer, we have to work harder. First we prove that $f’(x) = - {f’(1-x)}$. The proof is annoyingly long because to write the chain rule you need to make the function composition explicit.

lemma deriv_hf_minus: "deriv (hf n) = (λx. - deriv (hf n) (1-x))"
proof
  fix x
  have "hf n = hf n  (λx. (1-x))"
    by (simp add: fun_eq_iff hf_def mult.commute)
  then have "deriv (hf n) x = deriv (hf n  (λx. (1-x))) x"
    by fastforce
  also have "... = deriv (hf n) (1-x) * deriv ((-) 1) x"
    by (intro real_derivative_chain) auto
  finally show "deriv (hf n) x = - deriv (hf n) (1-x)"
    by simp
qed

Then we need a quick proof that all the necessary derivatives exist.

lemma deriv_n_hf_diffr [iff]: "(deriv^^k) (hf n) field_differentiable at x"
  unfolding field_differentiable_def hf_deriv_int_poly
  by (rule derivative_eq_intros exI | force)+

Then, we prove another claim about $k$-th derivatives: that $f^{(k)}(x) = (−1)^k f^{(k)}(1−x)$. It is an easy induction.

lemma deriv_n_hf_minus: "(deriv^^k) (hf n) = (λx. (-1)^k * (deriv^^k) (hf n) (1-x))"
proof (induction k)
  case 0
  then show ?case
    by (simp add: fun_eq_iff hf_def)
next
  case (Suc k)
  have o: "(λx. (deriv ^^ k) (hf n) (1-x)) = (deriv ^^ k) (hf n)  (-) 1"
    by auto
  show ?case
  proof
    fix x
    have [simp]: "((deriv^^k) (hf n)  (-) 1) field_differentiable at x"
      by (force intro: field_differentiable_compose)
    have "(deriv ^^ Suc k) (hf n) x = deriv (λx. (-1) ^ k * (deriv ^^ k) (hf n) (1-x)) x"
      by simp (metis Suc)
    also have "... = (-1) ^ k * deriv (λx. (deriv ^^ k) (hf n) (1-x)) x"
      using o by fastforce
    also have "... = (-1) ^ Suc k * (deriv ^^ Suc k) (hf n) (1-x)"
      by (subst o, subst deriv_chain, auto)
    finally show "(deriv ^^ Suc k) (hf n) x = (-1) ^ Suc k * (deriv ^^ Suc k) (hf n) (1-x)" .
  qed
qed

Finally we are able to show that the derivatives $f^{(k)} (1)$ are all integers. Moreover, for $k>2n$, the derivatives vanish.

lemma hf_deriv_1: "(deriv^^k) (hf n) 1  "
  by (smt (verit, best) Ints_1 Ints_minus Ints_mult Ints_power deriv_n_hf_minus hf_deriv_0)

lemma hf_deriv_eq_0: "k > 2*n  (deriv^^k) (hf n) = (λx. 0)"
  by (force simp add: cf_def hf_deriv_int_poly)

Time for the big one

The main argument lies in the proof that all integer exponentials are irrational. We assume that $\exp s$ has the form $a/b$ where $s$, $a$, $b>0$. We define $n$ as the larger of $a^2$ and $3s^3$, from which we eventually prove that $n! > as^{2n+1}$. We define

\[\begin{align*} F(x) = \sum_{i\le 2n} (-1)^i s^{2n-i} f^{(i)} (x). \end{align*}\]

and then show that

\[\begin{align*} F'(x) = -{sF(x)} + s^{2n+1} f(x). \end{align*}\]

We put $N = aF(1) − bF(0)$, show $N$ to be an integer, and eventually show by bounding an integral that $0<N<1$, contradiction. A few subproofs of a dozen or so lines are omitted below. The full development is available online.

lemma exp_nat_irrational:
  assumes "s > 0" shows "exp (real_of_int s)  "
proof
  assume "exp (real_of_int s)  "
  then obtain a b where ab: "a > 0" "b > 0" "coprime a b" and exp_s: "exp s = of_int a / of_int b"
    using Rats_cases' div_0 exp_not_eq_zero of_int_0
    by (smt (verit, best) exp_gt_zero of_int_0_less_iff zero_less_divide_iff)
  define n where "n  nat (max (a^2) (3 * s^3))"
  then have ns3: "s^3  real n / 3"
    by linarith
  have "n > 0"
    using a > 0 n_def by (smt (verit, best) zero_less_nat_eq zero_less_power)
  have s_le: "s ^ (2*n+1)  (n / exp 1) ^ n"
    omitted
  have a_less: "a < sqrt (2*pi*n)"
    omitted
  have "sqrt (2*pi*n) * (n / exp 1) ^ n > a * s ^ (2*n+1)"
    using mult_strict_right_mono [OF a_less] mult_left_mono [OF s_le]
    by (smt (verit, best) s_le ab(1) assms of_int_1 of_int_le_iff of_int_mult zero_less_power)
  then have n: "fact n > a * s ^ (2*n+1)"
    using fact_bounds(1) by (smt (verit, best) 0 < n of_int_fact of_int_less_iff)
  define F where "F  λx. i2*n. (-1)^i * s^(2*n-i) * (deriv^^i) (hf n) x"
  have Fder [derivative_intros]: "(F has_real_derivative -s * F x + s ^ (2*n+1) * hf n x) (at x)" for x
    omitted
  have F01_Ints: "F 0  " "F 1  "
    by (simp_all add: F_def hf_deriv_0 hf_deriv_1 Ints_sum)
  define sF where "sF  λx. exp (of_int s * x) * F x"
  define sF' where "sF'  λx. of_int s ^ Suc(2*n) * (exp (of_int s * x) * hf n x)"
  have sF_der: "(sF has_real_derivative sF' x) (at x)" for x
    unfolding sF_def sF'_def
    by (rule refl derivative_eq_intros | force simp: algebra_simps)+
  let ?N = "b * integral {0..1} sF'"
  have sF'_integral: "(sF' has_integral sF 1 - sF 0) {0..1}"
    by (smt (verit) fundamental_theorem_of_calculus has_vector_derivative_at_within
            has_field_derivative_iff_has_vector_derivative sF_der)
  then have "?N = a * F 1 - b * F 0"
    using b > 0 by (simp add: integral_unique exp_s sF_def algebra_simps)
  also have "...  "
    using hf_deriv_1 by (simp add: F01_Ints)
  finally have N_Ints: "?N  " .
  have "sF' (1/2) > 0" and ge0: "x. x  {0..1}  0  sF' x"
    using assms by (auto simp add: sF'_def hf_def)
  moreover have "continuous_on {0..1} sF'"
    unfolding sF'_def hf_def by (intro continuous_intros) auto
  ultimately have False if "(sF' has_integral 0) {0..1}"
    using has_integral_0_cbox_imp_0 [of 0 1 sF' "1/2"] that by auto
  then have "integral {0..1} sF' > 0"
    by (metis ge0 has_integral_nonneg integral_unique order_le_less sF'_integral)
  then have "0 < ?N"
    by (simp add: b > 0)
  have "integral {0..1} sF' = of_int s ^ Suc(2*n) * integral {0..1} (λx. exp (s*x) * hf n x)"
    unfolding sF'_def by force
  also have "...  of_int s ^ Suc(2*n) * (exp s * (1 / fact n))"
    omitted
  finally have "?N  b * of_int s ^ Suc(2*n) * exp s * (1 / fact n)"
    using b > 0 by (simp add: sF'_def mult_ac divide_simps)
  also have "... < 1"
    using n apply (simp add: field_simps exp_s)
    by (metis of_int_fact of_int_less_iff of_int_mult of_int_power)
  finally show False
    using 0 < ?N Ints_cases N_Ints by force
qed

The full result, for rational arguments, should be clear from the Isabelle text alone.

theorem exp_irrational:
  fixes q::real assumes "q  " "q  0" shows "exp q  "
proof
  assume q: "exp q  "
  obtain s t where "s  0" "t > 0" "q = of_int s / of_int t"
    by (metis Rats_cases' assms div_0 of_int_0)
  then have "(exp q) ^ (nat t) = exp s"
    by (smt (verit, best) exp_divide_power_eq of_nat_nat zero_less_nat_eq)
  moreover have "exp q ^ (nat t)  "
    by (simp add: q)
  ultimately show False
    by (smt (verit, del_insts) Rats_inverse s  0 exp_minus exp_nat_irrational of_int_of_nat)
qed