Overhearing (mainly) computer scientists talking to each other
(My) shallow thoughts about deep learning, III: a quick review of the remaining sessions of The Workshop.
We can imagine other futures, but to do so, we have to maintain independence from the narrative being pushed by those who believe that “AGI” is desirable and that LLMs are the path to it. (Emily Bender, “Resist the Urge To Be Impressed”)
The panels not covered in the previous essays in this series contained very little material relevant to the aims of this newsletter. They are interesting nevertheless for what they reveal, in part inadvertently, about the profound differences between the aims and outlooks of computer scientists and mathematicians. There’s a particularly telling moment about an hour into the panel
Day 3: Research Advances in Computer Science1
in which the participants, all computer scientists, betray their puzzlement over what makes mathematicians tick. At around 1:08:30, moderator Talia Ringer asked:
Do you think mathematicians really need these fancy logics and fancy type theories? … How would you convince a mathematician that it might be worth investing more in … foundations … and that it might interact with what they can do?
An old friend, a logician who works in a computer science department, has for several years been asking me why most mathematicians appear to be indifferent to the methods on display at The Workshop. Ringer’s question may be motivated by a similar experience, one my friend finds incomprehensible. My article in Quanta represents one attempt to answer my friend’s question. Of course the panelists had not read my article, nor anything else written by a mathematician that doesn’t conform to their preconceptions, so they were forced to speculate.
Nanevski referred to an earlier talk (presumably Commelin’s on Scholze’s Liquid Tensor Experiment); although he didn’t remember the names, he gathered that the mathematician “now has a better understanding.”
For program verification … mechanization matters because it is a way to understand the structure of the problem. I wonder whether that’s the same for mathematicians. If it is, then I guess they will take up mechanizing; if it isn’t, then so be it.
Pientka observed that when you try to formalize something in a proof assistant “you have to really rethink” your ideas. Here Nanevski and Pientka agreed that “mechanization will lead to new math.” And then Pientka referred to Vladimir Voevodsky’s article, whose subtitle — “A Personal Mission to Develop Computer Proof Verification to Avoid Mathematical Mistakes” — offers a possible answer to Ringer’s question — “to bring more certainty to the whole game of mathematics.”
[Remember this expression: “game of mathematics.”]
This last answer is relevant at least to some mathematicians but still seems marginal to the mathematical community as a whole. Shinnar remarked that, having mechanized a number of proofs, he finds they
…have mistakes all the time. The problem is convincing mathematicians the mistakes are a problem. Most of the mistakes come down to the software engineering aspects of mathematics… someone has a theorem which has some dependencies, some assumptions, and sometimes they just get lazy about stating [some technical conditions].2 At the end of the day most of these cases wind up being resolvable,… unfortunately there’s an issue in the math community where you’re trying to publish this kind of result… they’re just like, eh morally … the result was right and we don’t care, roughly speaking. And I’m not saying they’re wrong, I’m just saying there’s a bit of a disconnect between our communities in terms of how much we care about those things. [My emphasis.]
Shinnar followed up this observation with a second insight: mathematicians write things like “using the standard argument” in situations where no such argument has ever been abstracted in the form of an explicit lemma. “Everyone has sort of a feel for what the standard argument is,” but since it’s not written down, either it has to be re-proved over and over, or else someone [presumably the computer scientist collaborating on a formalization] is left “figuring out… what the actual lemma should be.” Mentioning how this involves slightly shifting foundations, he concluded that “mathematicians actually do do that, they just do it more intuitively.”
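Shinnar’s second point is easy to make concrete. On paper, “the standard argument” is a gesture; in a proof assistant it has to become an explicit, named lemma before anyone can appeal to it. Here is a minimal Lean sketch of that abstraction step (my own illustration, not an example from the panel):

```lean
-- Illustration only (not Shinnar's example): a fact that a paper might wave
-- through with "by the standard argument" must be stated once, as an
-- explicit lemma, before a proof assistant will accept any appeal to it.

def Injective (f : α → β) : Prop :=
  ∀ ⦃a b : α⦄, f a = f b → a = b

-- The "standard argument" (here: a composition of injections is injective),
-- abstracted once as a named lemma and reusable ever after:
theorem injective_comp {f : α → β} {g : β → γ}
    (hg : Injective g) (hf : Injective f) : Injective (g ∘ f) :=
  fun _ _ h => hf (hg h)
```

Finding the right statement to write before the `:=` is exactly the “figuring out what the actual lemma should be” that Shinnar describes.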
Watching this remarkable exchange I immediately perceived its potential to culminate in a satori moment that would have been enough on its own to justify the time and effort put into organizing The Workshop. Instead of shaking their heads in dismay at this perceived defect3 of mathematics as practiced, the panelists and their moderator could have shouted Eureka! into their screens, in unison — realizing that this (very relative) inattention to detail is a virtue4 peculiar to mathematics, a way of thinking to be cherished rather than lamented.5 But blinded by the constraints of thinking like programmers, and by the format of The Workshop that left no time for serendipitous enlightenment, the panelists let the moment pass.
Day 1, Case Studies: Proof Verification and Checking
It was Talia Ringer again who moderated this panel, with Thierry Coquand, Johann Commelin, and Greg Morrisett. Unlike the others, Commelin is a mathematician, but the discussion was basically directed at the interests of computer science, and what went on here was mostly irrelevant to the aims of this newsletter. I just mention a few points that caught my attention.
As I expected, Coquand began his talk with a standard formulation of the Central Dogma of Mathematical Formalism:
That … formal rules could be described in complete details for mathematics is a relatively recent discovery: Frege (1879), Russell-Whitehead (1910), Hilbert (1920s)
This does not mean that intuitions, social aspects don't play a crucial role in mathematics but at the end, a mathematical argument is only accepted if it is formally correct, through an objective process that has been slowly (and is still being) made explicit.
I’ve already explained more than once, in this newsletter and elsewhere, what I see as the limitations of this perspective. Rather than repeat my arguments here, I refer the reader to Karine Chemla’s introduction to this book, which raises the fundamental challenge to the Central Dogma on historical grounds:
on what basis do we grant ‘incontrovertibility’ as the essential value and goal of mathematical proof within mathematics itself?
and
Why [in view of the evidence presented in the book] should historians or philosophers opt for one idea as the correct one and civilize the past, let alone the present, on this basis?
Chevalley’s attempt to organize a seminar on the proof of the Feit-Thompson Theorem, which ended in frustration after two years, gave Coquand the opportunity to call Jean-Pierre Serre to the witness stand in support of formalization: “…it is not a very comfortable situation,” Serre stated in an interview, to have to accept such theorems “on faith.” Serre was also “uneasy” with blackboard proofs in differential topology of theorems “taking place in five dimensions or more.” In view of what came next — which Coquand neglected to mention — it’s fair to wonder whether Serre felt less uneasy6 when he learned that the theorem had been formally verified:
Computers … are also very useful when there is a large search to be made … A notorious example is the proof of the Four-Colour Theorem. There is however a problem there, somewhat similar to the one with Feit-Thompson: such a proof cannot be checked by hand; you need a computer (and a very subtle program). This is not very comfortable either.
Coquand concluded his presentation with a challenge:
AI can be incredibly good for some games; why not in the game of proving mathematical theorems?
Several other participants (notably Pientka, as mentioned above) also compared proving to playing games. A recent major study by Catarina Dutilh Novaes argues
that dialogical, debating practices in a democratic city-state like Athens were causally instrumental for the emergence of the axiomatic-deductive method in mathematics.7
It’s in the eye of the beholder whether this long-term evolution of proof, from its roots in democratic debate to its formalization as one kind of game among many, is or is not a good thing; in any case, where proof lands over the next century is not something we can predict. The reader will, however, note that one can simply resolve the tension between, on the one hand, the attempt to trace formalism in proof to Frege (who wished “[t]o prevent anything intuitive from penetrating here unnoticed”8) and, on the other, the vision of proof as a game, by positing mathematics as a game played entirely by zombies devoid of intuition.
If I were keeping score on this panel, Commelin would get points for mentioning (in passing) ecological concerns — presumably by this he meant the consumption of energy (and its associated carbon budget) and water. During his 15-minute presentation on the final day, Alex Kontorovich pointed out that LLMs like GPT or Bard are trained on roughly 1 trillion lines of data, whereas formalized libraries contain fewer than 10 million. It shouldn’t be hard to estimate just how much CO2 and water would be needed to expand those libraries by a factor of 100,000, and it would be desirable as well as socially responsible for the organizing committee to devote an entire panel of a future meeting to precisely this issue. Who thinks this will happen?
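Indeed it shouldn’t be hard, at least in outline. Here is a back-of-the-envelope sketch in Python; the corpus sizes are the ones Kontorovich quoted, but the per-line carbon and water costs are placeholders I invented purely to show the shape of the calculation, not measurements of anything:

```python
# Back-of-the-envelope sketch of the scaling question. Corpus sizes follow
# Kontorovich's talk; the unit costs are INVENTED placeholders, included
# only to show how such an estimate would be assembled.

llm_corpus_lines = 10**12  # ~1 trillion lines of LLM training data
formal_lib_lines = 10**7   # fewer than 10 million formalized lines

factor = llm_corpus_lines // formal_lib_lines
print(f"expansion factor: {factor:,}x")  # 100,000x

# Hypothetical unit costs per newly formalized line (assumptions!):
kg_co2_per_line = 0.01   # placeholder, not a measurement
liters_per_line = 0.05   # placeholder, not a measurement

new_lines = llm_corpus_lines - formal_lib_lines
print(f"CO2:   {new_lines * kg_co2_per_line:,.0f} kg")
print(f"water: {new_lines * liters_per_line:,.0f} liters")
```

Whatever the true unit costs turn out to be, multiplying anything plausible by a factor of 100,000 yields numbers worth an entire panel.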
Apart from this, I have nothing to add to Commelin’s talk, except to advise readers to look out for his forthcoming article with Adam Topaz, in the special issue of the Bulletin of the American Mathematical Society that builds on his talk at last October’s meeting in Toronto. Oh yes, he has a challenge too.
Morrisett spoke about formal software verification, which I’ve repeatedly cited as the original motivation for formalized proofs, where they are probably indispensable. No more need be said about this. However, I couldn’t help noticing one sinister slide, about drones.
Drones are a dual-use technology: they can be used to blow up buildings and also to help search for survivors in the rubble. This is a reminder that it is incumbent upon academics as well as military contractors to review the ethical guidelines of the Chiodo-Müller Manifesto for the Responsible Development of Mathematical Works, notably points 1.1.2:
Identify your funders, their motives, and what they can do with your work after you hand it over. Investigate their background and history for any questionable activities, or factors that may pose a risk to you or to others.… Do you know who you’re dealing with?
and 1.2.2:
Play devil’s advocate to understand how an irresponsible entity might utilise your mathematical output, tool or the project as a whole to deliberately inflict harm. Investigate if any sub-solutions and components of your product can be used for harm. What deliberate harm is possible?
In the Q&A, moderator Ringer read a question by Ernie Davis: how in practice do you know that the formal statement actually corresponds to the English statement? Commelin answered: checking this for a hypothetical formal proof of Fermat’s Last Theorem would simply amount to checking that the definition of the natural numbers and their arithmetic in the theorem prover is correct. For a complicated statement like the one in the Liquid Tensor Experiment this is completely impossible; you’d have to check more than 1000 definitions. Instead they applied a form of abductive reasoning, also known as the duck test. It’s interesting to me that Commelin’s team took this seriously — you can hear the details at 4:07:00 — because, as I have already pointed out in this newsletter, it would be paradoxical to say that one can prove rigorously that the two are the same when only one of the statements is formalized. I take this to confirm my belief that formalization brings no metaphysical benefit, though it may provide spiritual comfort. And indeed, computer science is not primarily concerned with metaphysics.
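To see why the Fermat case is the easy one, consider what the formal statement would actually mention. Here is a minimal Lean sketch (my own illustration; neither the statement’s phrasing nor the name comes from the Workshop):

```lean
-- Illustrative only: a formal statement of Fermat's Last Theorem mentions
-- nothing beyond the natural numbers, order, addition, exponentiation, and
-- equality, so checking that it matches the English statement reduces to
-- checking those few definitions. The `sorry` marks the (very large) hole.
theorem fermat_last_theorem (n : Nat) (hn : 2 < n) (x y z : Nat)
    (hx : 0 < x) (hy : 0 < y) : x ^ n + y ^ n ≠ z ^ n := by
  sorry
```

The statement of the Liquid Tensor Experiment, by contrast, sits at the top of a tower of more than 1000 definitions, which is why no such inspection was feasible there.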
Like the last two panels, these two
Day 1, Development of Datasets Specific to the Mathematical Sciences
Day 2, Mathematical Foundations of Machine Learning
mainly showed computer scientists talking to each other, a conversation to which I have nothing to contribute; and I see nothing in my notes here that is really relevant to the concerns of this newsletter. So there is no reason to grade these panels on my Scorecard; but if I were to give grades, then Rebecca Willett’s presentation in the second of the two panels would earn points for Context, for a slide reminding us that machine learning is above all an investment opportunity9, and points for Externalities for another slide.
My attempt to complete my report in a single post bumped up against the “email length limit.” It will therefore be continued, immediately, in a separate post, starting with a brief account of a panel discussion with Ursula Martin and Stanislas Dehaene.
Moderated by Talia Ringer, with Brigitte Pientka, Aleksandar Nanevski, and Avraham Shinnar.
Here (around 1:12:08) the four computer scientists nodded and rolled their eyes, as if in recognition. As for me, I’m still wondering whether I want to visit a world in which mathematics has “software engineering aspects.”
In Nerdspeak: bug. For Shinnar it’s actually worse: an effect of laziness, which is the deadly sin of sloth.
In Nerdspeak: feature. If the panelists had ever tried to publish in a mathematics journal they would have realized that referees and editors strongly enforce this virtue by advising authors to leave out superfluous details.
Mathematicians often have similar complaints about the failure of physicists to provide rigorous foundations for intuitions that, in many cases, have led to the creation of entire branches of mathematics that rigorous thinking might never have uncovered.
A colleague who spoke to Serre about just this formalization didn’t go so far, but he did say that Serre was “interested.”
C. Dutilh Novaes, The Dialogical Roots of Deduction, Cambridge University Press (2020), p. xi. In this the author relies on the work of G.E.R. Lloyd and R. Netz.
Begriffsschrift, Preface, p. 5.
Evgeny Morozov also reminds us of this in his NY Times opinion piece:
Last year, [Sam] Altman stated that “A.G.I. is probably necessary for humanity to survive” because “our problems seem too big” for us to “solve without better tools.” He’s recently asserted that A.G.I. will be a catalyst for human flourishing.
But companies need profits, and such benevolence, especially from unprofitable firms burning investors’ billions, is uncommon. OpenAI, having accepted billions from Microsoft, has contemplated raising another $100 billion to build A.G.I. Those investments will need to be earned back — against the service’s staggering invisible costs. (One estimate from February put the expense of operating ChatGPT at $700,000 per day.)