22 Comments

I'm reminded of Sheryl Sandberg's much-ballyhooed concept that if women only "Lean In" to conversations in work meetings, equality in the workplace would automatically result. It seems equally magical thinking to believe that if mathematicians Lean In, understanding will follow.

Expand full comment

> However, it’s impossible to prove formally that the “more modern version” is equivalent to Wiles’s original proof without formalizing the latter, so I don’t know why the computer scientists who were obsessing about this in 2005 would accept this substitute for the original.

You have returned to this point many times and I still entirely don't understand why it is relevant.

First note that even if the original version were formalized it would not be possible to prove they were equivalent, for multiple reasons: First, there is not even a well-defined mathematical notion of the equivalence of two formal proofs other than the literal equivalence of two proofs containing the same sequence of steps or, for intuitionistic proofs, the notion of two proofs producing the same function (but Wiles' proof is presumably not intuitionistic and anyways any two functions arising from intuitionistic proofs of Fermat would be the same as they would be functions from the empty set to the empty set). Second, the proofs are of course not equivalent in an informal sense, because the more modern version is, I assume, different in incorporating more modern ideas that allow certain steps of the original proof to be sketched.

But I don't see why computer scientists should care about the difficulties with proving a potential new proof is equivalent to an earlier proof. Was that ever their goal? The original problem of Jan Bergstra specifies "a proof of Fermat's Last Theorem, as proved by A. Wiles in 1995". I think the "as proved" clause is meant to provide proper credit while clarifying if necessary which theorem is meant. "a proof" clearly implies that any proof will do.

The document of Wim H. Hesselink you linked makes clear that the reason Wiles' proof is important to the author is because it also proves the modularity conjecture for semistable curves. It's clear Wim would prefer a simpler proof if it also proves this modularity statement - this is made clear in the third-to-last section. But of course one can verify that a proof proves the modularity conjecture for semistable curves without also verifying that it is equivalent to Wiles' proof.

More broadly I don't see the point in any situation of proving some formal proof is equivalent to some informal proof, and since as you point out the impossibility of this is a tautology, I think it's not reasonable to assume that people who are interested in formalization are trying to achieve this type of proven equivalence. Rather I think people have different motivations. Among these are to be sure that the theorem is correct, which does not require being sure that any past proof is correct.

But even if one's goal is to be more confident that a certain published proof is correct, there is no need to prove that the published proof is equivalent to a formal proof. Simply producing a formal proof and observing that this looks similar to the published proof gives increased confidence that the published proof is correct (in the informal sense used by ordinary mathematicians). Why is this? At least sometimes a serious error exists in a published proof. A few times, such errors have been detected in attempts to formalize a proof (i.e. produce a formal proof similar in some respects to the original proof). On the other hand no serious error can be detected if the proof is not wrong. So it follows that if one produces a formal proof without finding a serious error, the probability that no serious error exists is greater than previously, even if it's possible that a serious error in the original proof was not detected since a somewhat different proof was in fact formalized.

Expand full comment

I am probably taking this further than you intended, but there is a curious "quasi-borderline" example in "The Tempest", Act 5, Scene 1:

Their understanding

Begins to swell, and the approaching tide

Will shortly fill the reasonable shore

That now lies foul and muddy.

While this is being said, Alonso and the court party are still "under the influence of charm" and "bereft of senses", so, arguably, not fully "animate".

Expand full comment

Here is an example combining "Lean" and "understand" (from "Two Gentlemen of Verona", Act 2, Scene 5):

LAUNCE

Ay, and what I do too: look thee, I'll but lean,

and my staff understands me.

SPEED

It stands under thee, indeed.

LAUNCE

Why, stand-under and under-stand is all one.

Expand full comment

Twelfth Night, Act 3, Scene 1:

"My legs do better understand me, sir, than I understand what you mean by bidding me taste my legs."

Expand full comment

And here is an example from Macbeth (Act 3, Scene 2) of a "verse attributing `agency' to an inanimate object" (which is incidentally the origin of the expression "invisible hand"):

Come, seeling night,

Scarf up the tender eye of pitiful day

And with thy bloody and invisible hand

Cancel and tear to pieces that great bond

Which keeps me pale.

Expand full comment

Hamlet Act 1, Scene 1:

"This spirit, dumb to us, will speak to him."

is an example of a "verse attributing 'understanding' to an inanimate object".

Expand full comment

Yes, it is not easy to characterize what is going on here. There are semantic shifts in language and thought that take place without much regard for pattern in time and place. How are LLM's to capture these? For example, what would it make of the various uses of the word 'diagram' in mathematics? How would it avoid errors of the sort it made when you queried it? Buzzard appears to be promoting LEAN as an 'assistant' ... that seems quite plausible and maybe even useful. We don't seem to have found a good role for LLM's just yet...maybe when Birnam Wood comes to Dunsinane?

Expand full comment

the Macbeth line appears to have been misunderstood both by GPT and the author of the post. By 'sensible' to sight and feeling Macbeth means 'able to be sensed by' feeling and sight ...in the old philosophical jargon, these are examples of 'sensibilia' ...the different senses were said to have different things they could sense...their sensibilia (Aristotle discusses this generally in Books II and III of de Anima and specifically in de Sensu). Macbeth does not attribute feeling, sight or understanding to the dagger. Somewhere in the 17th c the notions of 'object(ive)' and 'subject(ive)' were inverted. Today 'sensible' refers to the person sensing, not the thing sensed.

Expand full comment