Two small examples by Fields medallists

28 Feb 2024

[ examples  sledgehammer  ]

A couple of weeks ago, Tim Gowers posted on Twitter an unusual characterisation of bijective functions: that they preserve set complements. Alex Kontorovich re-tweeted that post accompanied by a Lean proof detailing Gowers’ argument. I took a look, and lo and behold! Isabelle can prove it with a single sledgehammer call. (That one line proof isn’t necessarily the best proof, however. Remember, we want proofs that are easy to read and maintain.) And Terrence Tao published a small example on Mastodon; let’s look at that one too.

Gowers’ original tweet

Here is the original tweet, a thread in classical Twitter style:

I’ve just noticed that a function f:X->Y is a bijection if and only if it preserves complements, that is, if and only if f(X\A)=Y\f(A) for every subset A of X. Is this a standard fact that has somehow passed me by for four decades? Simple proof in rest of (short) thread. 1/3

If f is a bijection and B=X\A, then f preserves unions and intersections and f(X)=Y, so f(A) and f(B) are disjoint and have union equal to Y. Conversely, if f preserves complements, then setting A = emptyset, we see that f(X)=Y, so f is a surjection. 2/3

And for every x we also have that f(X{x})=Y{f(x)}. Therefore, if x and y are distinct, then so are f(x) and f(y). So f is an injection as well. 3/3

In standard mathematical notation, the claim is that if a function $f:X\to Y$ is given, then $f$ is a bijection from $X$ to $Y$ if and only if it preserves complements, i.e. if $f[X\setminus A] = Y \setminus f[A]$ for all $A\subseteq X$. Incidentally, there are various ways of writing the image under a function of a set; Here I use square brackets, while Lean and Isabelle provide their own image operators.

The Lean formalisation

Kontorovich posted his version as an image:

Formalisation of the bijection proof in Lean by Alex Kontorovich

Note that he has written out the argument in detail, with plenty of comments to explain what is going on.

Investigating the problem in Isabelle

This problem looked intriguing, so I tried it with Isabelle. The brute force way to tackle such a proof is

  1. try the auto proof method to solve or at least break up the problem
  2. invoke sledgehammer on the subgoals that are produced.

The proof you get this way is likely to be horrible. However, once you have your first proof, it’s easy to get a nicer proof. (And you should take the trouble.) For the current problem, if you type auto, you get four ugly subgoals, each of which sledgehammer proves automatically. I don’t want to show this, but you can try it for yourself. The Isabelle theory file is here.

What I actually tried, first, was to split the logical equivalence into its two directions. I was pleased to see that sledgehammer could prove both. Then I thought, let’s see if it can prove the whole claim at once, and indeed it could!

The Isabelle proofs

To begin, a technicality about notation. In Isabelle, set difference is written with a minus sign, $A-B$, because the standard backslash character is reserved for other purposes. The usual set difference symbol can be selected from the Symbols palette or typed as \setminus (autocomplete will help). So let’s begin by setting that up, allowing us to use the conventional symbol. It will be accepted for input and used to display results.

abbreviation set_difference :: "['a set,'a set]  'a set" (infixl "" 65)
  where "A  B  A-B"

The following is the nicest of the one-shot proofs found by sledgehammer. This problem turned out to be relatively easy; three of the constituent provers solved it.

lemma "bij_betw f X Y  (A. AX  f ` (XA) = Y  f`A)"
  by (metis Diff_empty Diff_eq_empty_iff Diff_subset bij_betw_def image_is_empty 
            inj_on_image_set_diff subset_antisym subset_image_inj)

I don’t actually recommend that you allow proofs of this sort to accumulate in your development. It leaves us completely in the dark as to why the claim holds. Moreover, if you want your development to be maintainable, it needs to be resilient in the presence of change. I’m always having to make corrections and adjustments (because I’m always making mistakes), And while rerunning the proofs can be an anxious moment, usually they all work fine. At worst, they can be fixed by another sledgehammer call. Opaque proofs like the one above will be hard to fix when they break.

The simplest way to get a clearer proof for this particular problem is by separately treating the left-to-right and right-to-left directions. This is also an opportunity to see the is mechanism for matching a pattern to a formula. An arbitrary pattern is permitted, and here we set up ?L and ?R to denote the left and right hand sides.

lemma "bij_betw f X Y  (A. AX  f ` (XA) = Y  f`A)"  (is "?L=?R")
proof
  show "?L  ?R"
    by (metis Diff_subset bij_betw_def inj_on_image_set_diff)
  assume ?R
  then have "inj_on f X" "f ` X = Y"
    by (auto simp: inj_on_def)
  then show ?L
    by (simp add: bij_betw_def)
qed

This proof is much clearer. The left to write proof requires only three previous facts. The right to left proof is practically automatic. You might argue that even here, the actual reasoning is still opaque. However, this proof tells us that the right to left direction is essentially a calculation from the definitions, while the opposite direction is the consequence of three facts (rather than eight, as before). This sort of proof will be much easier to maintain.

A further Isabelle bonus: note that both the Lean proof and Gowers’ informal argument begin by assuming $f:X\to Y$. The Isabelle version states unconditionally that $f$ is a bijection from $X$ to $Y$ if and only if it preserves complements. The implicit typing of $f$ ensures only that it is a function: over arbitrary types that we don’t even mention.

Tao’s example

Unfortunately, I wasn’t able to locate Tao’s original post. But he stated a nice little problem and gave a formalisation using Lean, and again I couldn’t help trying it out in Isabelle. I liked my proof more.

We are given a decreasing real-valued sequence $\{a_k\}$ and a family of non-negative reals $\{D_k\}$ such that $a_k\le D_k - D_{k+1}$ for all $k$. The task is to prove $a_k \le \frac{D_0}{k+1}$.

lemma
  fixes a :: "nat  real"
  assumes a: "decseq a" and D: "k. D k  0" and aD: "k. a k  D k - D(Suc k)"
  shows "a k  D 0 / (Suc k)"
proof -
  have "a k = (ik. a k) / (Suc k)"
    by simp
  also have "  (ik. a i) / (Suc k)"
    using a sum_mono[of "{..k}" "λi. a k" a] 
    by (simp add: monotone_def divide_simps mult.commute)
  also have "  (ik. D i - D(Suc i)) / (Suc k)"
    by (simp add: aD divide_right_mono sum_mono)
  also have "  D 0 / (Suc k)"
    by (simp add: sum_telescope D divide_right_mono)
  finally show ?thesis .
qed

Isabelle’s calculational style is perfect for this sort of inequality chain.

Final remarks

Always break up your problem into its constituents – probably by calling auto – before calling sledgehammer. The effort needed to prove all the separate parts is generally much less than that needed prove the whole in one go. Besides which, part of your problem may simply be too difficult for sledgehammer. Better to isolate that part to work on later, while disposing of the easier bits.

The Isabelle theory file is available to download.