At some point in the not too distant future, most of our conversations will be with machines. I haven’t seen this prediction in print anywhere, with or without an estimate of the proportion — will it be 70% or more like 90%? — and I’m not willing to hazard a guess whether or not this cheerful prospect will be realized within my lifetime. But if you read between the lines of any of the vast number of articles written about the latest implementations of large language models — gushing testimonials in ordinary newspapers, of course, but also a “guarantee that minds will be blown” by next year’s model, by the usually cautious Gary Marcus — I think the inevitable endgame is coming into view.
If you don’t believe me, think about how many of your telephone conversations are now with robots. 2%? Maybe 5%? Probably much higher if you count your conversations with your telephones. Could you have imagined this 10 years ago?
Like everyone else, I have been chatting with OpenAI’s large language models. A few months ago I shared a sample of GPT-2 prose that began
But there is a third consequence to this method of dealing with a proposition. That proposition is a proposition. It is a proposition that is a proposition. It is a proposition that is a proposition. If, on the other hand, a proposition is a proposition, we could prove and prove that a proposition is a proposition by reason.
I’m pretty sure GPT-2 is also responsible for the following interesting take on the Bounded Gaps Theorem. In response to the prompt
Here is Yitang Zhang's proof of the Bounded Gaps Theorem.
it generated this:
We can define a Bounded Gaps function that describes the number or direction of a triangle with respect to the first degree circle with respect to the angle of the middle point of the triangle. When given a value of a particular value, we can apply a "fixed" value applied only to the corner, and in practice does not work much better than a value that uses both directions. The value we are giving is usually called its center value, and it is a function that consists of a function that is called a "square" function. To illustrate this concept, consider a triangle with 20 corners, 1 x 10, 2 x 40 and 1 x 25. The circle must be 10 squares to obtain a 5 square point. Therefore we can say that there are 5 intersections by this constant. This same function can be applied to any number of vertices to produce a 5 square point, or even infinity of squares by applying a function to any n squares, a nth number of the number of n square points. Since we have chosen the number of polygons along the four corners by having every intersection have a 5 square point, this formula is very difficult to apply to any triangle. Now, consider a triangle with 20 vertices, 10 triangles, and 1 triangle. The angle of the middle point of the triangle is 25° - 15°. Given a value of c (2) + s 1
The note containing this paragraph is dated November 6, 2022, but unfortunately I don’t remember the circumstances that inspired me to invite GPT-n to a chat that day, nor whether the exchange actually took place a few weeks earlier, nor even whether n = 2 or n = 3.
Anyway, ChatGPT is much more gregarious than its ancestors, and it has to be:
How many servers does it take to keep track of this traffic? How many households could be kept warm this winter with the energy this consumes?
The website where I found the user data doesn’t estimate the number of users who have fallen in love with ChatGPT, although it does happen, probably more frequently than anyone is willing to admit. The system is also happy to chat about mathematics. Here is the transcript of a recent chat with BU Computer Science professor Alley Stoughton, shared with me by a mutual friend; I’m sure many of you have recorded similar exchanges.
Alley: Are there any pairs of prime numbers that differ by exactly 3?
ChatGPT: Yes, there are pairs of prime numbers that differ by exactly
3. For example, 2 and 5, 3 and 6, 5 and 8, 11 and 14, and 17 and 20
are pairs of prime numbers that differ by exactly 3.
Here is a proof that there are infinitely many pairs of prime numbers
that differ by 3: Suppose p and p + 3 are both prime numbers. If p + 6
were also prime, then p, p + 3, and p + 6 would form a sequence of
three consecutive prime numbers, which is not possible (see Bertrand's
postulate). Therefore, p + 6 must be composite. Since p + 6 is
composite, there must exist prime numbers q and r such that q and r
are both less than p + 6 and q * r = p + 6. But since p is prime, we
have p < q * r = p + 6, so p < q and p < r. This means that either q =
p + 3 or r = p + 3. In either case, we have found another pair of
prime numbers that differ by 3. Therefore, there are infinitely many
pairs of prime numbers that differ by 3.
Alley: I agree that 2 and 5 are a pair of prime numbers that differ by
exactly 3, but are you sure there are other such pairs?
ChatGPT: I apologize for the mistake. 3 and 6 are not a pair of prime
numbers that differ by exactly 3, because 3 is not prime.
However, there are indeed other pairs of prime numbers that differ by
exactly 3. Here are some examples: 5 and 8, 11 and 14, 17 and 20, and
so on. I apologize for the error in my previous response.
Alley: Is 8 a prime number?
ChatGPT: No, 8 is not a prime number. A prime number is a positive
integer greater than 1 that has no positive integer divisors other
than 1 and itself. 8 is divisible by 2 and 4, so it is not a prime
number.
The prime factorization of 8 is 2 * 2 * 2, or 2^3. This means that 8
can be written as the product of three prime factors, 2 * 2 * 2, or as
2^3. The prime factorization of a composite number such as 8 can be
written as the product of its prime factors, whereas the prime
factorization of a prime number such as 5 is simply the number itself.
Alley: OK, but in your previous answer, you said that 5 and 8 are a
pair of prime numbers whose difference is exactly 3. Isn't this
incorrect?
ChatGPT: I apologize for the mistake in my previous response. 5 and 8
are not a pair of prime numbers whose difference is exactly 3, because
8 is not a prime number.
Here is a list of some pairs of prime numbers whose difference is exactly 3:
2 and 5
5 and 8
11 and 14
17 and 20
29 and 32
41 and 44
I apologize for any confusion that my previous response may have
caused. Please let me know if you have any other questions.
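For the record, a few lines of Python (mine, not ChatGPT’s) settle the question Alley asked. Since p and p + 3 have opposite parity, one of them is even, and the only even prime is 2; so (2, 5) is the only pair of primes that differ by exactly 3, contrary to every list ChatGPT offered. A brute-force search confirms this:

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; perfectly adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

# Search for pairs of primes (p, p + 3) with p below 10,000.
pairs = [(p, p + 3) for p in range(2, 10_000) if is_prime(p) and is_prime(p + 3)]
print(pairs)  # [(2, 5)]
```

The parity argument guarantees that raising the search bound will never turn up another pair, which is exactly the kind of one-step reasoning the chatbot’s “proof” never manages to perform.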
It so happens that I spent Christmas with one very extended portion of my extended family, and that included two children of ages 2 and 4. The 2-year-old was just learning to pronounce words properly and to put them into sentences of longer than two words, while the 4-year-old was putting his excellent grammar to use in asking for explanations of everything — acquiring the sort of knowledge of the world that AI researchers call “common sense.” I can guarantee that within at most 15 years both of these children will display considerable common sense in formulating ingenious ways to insult their parents and teachers. Why is no one at OpenAI telling us that their programmers will have to manage an extended period of adolescent rebellion before they will ever be satisfied that their large language models have attained the goal of artificial general intelligence? (To paraphrase Caliban, what is the profit of AGI if it doesn’t know how to curse?)
Let’s leave these speculations for another day. I am convinced that a close reading of ChatGPT’s proof of the infinity of prime pairs that differ by 3 can highlight not only a few small helpings of common sense that could be programmed into its database, but also something about what’s going on in the system’s internal model. For one thing, the first and last sentences of the "proof" are recognizably of the correct form. Moreover, the paragraph has something in common with the formal structure of a proof, and it's probably not entirely by chance that a few of the deductions are correct.
A close reading of this and similar ChatGPT utterances is also likely to shed considerable light on what mathematicians have in mind when we speak about “understanding,” as I have done frequently in this newsletter. Something is going on in that proof that points to an absence of “understanding,” and it does so in a way that should make it clear that substituting “reasoning” for “understanding,” as I have argued is likely to be the programmer’s first impulse, will not solve the problem.
But now it’s time for me to try to finish this month’s first real essay.
P.S.: FORGOT THIS
My first chat with ChatGPT was a version of the Winograd Schema Challenge, mentioned previously here.
Me: The umbrella didn't fit in the suitcase because it was too big. What was too big?
OpenAI: The umbrella was too big to fit in the suitcase.
So far, so good.
Me: The mouse didn't fit in the cage because it was too small. What was too small?
OpenAI: The mouse was too small to fit in the cage.
A few weeks later I tried this a second time, and it scored 100% on the challenge. I don’t know whether this is a statistical fluke or whether some human being tried to knock some common sense into the model in the meantime.
Searle's famous Chinese Room thought experiment shows how black-box outputs can mimic "understanding" without even an iota of it being present.