Maths fundamental flaw: a brief explanation

Photo of author


There is a hole at the bottom of math, a hole that means we will never know everything with certainty. There will always be true statements that cannot be proven. Now, no one knows what those statements are exactly, but they could be something like the Twin Prime Conjecture. Twin primes are prime numbers that are separated by just one number, like 11 and 13, or 17 and 19. And as you go up the number line, primes occur less frequently, and twin primes are rarer still.

But the Twin Prime Conjecture is that there are infinitely many twin primes. You never run out of them. As of right now, no one has proven this conjecture true or false. But the crazy thing is this: we may never know because what has been proven is that in any system of mathematics where you can do basic arithmetic, there will always be true statements that are impossible to prove. That is life.

Specifically, this is the Game of Life, created in 1970 by mathematician John Conway. Sadly, he passed away in 2020 from Covid-19. Conway’s Game of Life is played on an infinite grid of square cells, each of which is either live or dead, and there are only two rules.

One: any dead cell with exactly three neighbors comes to life. Two: any living cell with fewer than two or more than three neighbors dies. Once you’ve set up the initial arrangement of cells, the two rules are applied to create the next generation, and then the one after that, and the one after that, and so on. It’s totally automatic. Conway called it a zero-player game.

But even though the rules are simple, the game itself can generate a wide variety of behavior. Some patterns are stable; once they arise, they never change. Others oscillate back and forth in a loop. A few can travel across the grid forever, like this glider here. Many patterns just fizzle out, but a few keep growing forever; they keep generating new cells.

Now, you would think that given the simple rules of the game, you could just look at any pattern and determine what will happen to it. Will it eventually reach a steady state, or will it keep growing without limit? But it turns out this question is impossible to answer. The ultimate fate of a pattern in Conway’s Game of Life is undecidable, meaning there is no possible algorithm that is guaranteed to answer the question in a finite amount of time. You could always just try running the pattern and see what happens. I mean, the rules of the game are a kind of algorithm after all. But that’s not guaranteed to give you an answer either because even if you run it for a million generations, you won’t be able to say whether it’ll last forever or just 2 million generations or a billion or a googolplex.

Is there something special about the Game of Life that makes it undecidable? Nope, there are actually a huge number of systems that are undecidable, like Wang tiles, quantum physics, airline ticketing systems, and even Magic: The Gathering. To understand how undecidability shows up in all of these places, we have to go back 150 years to a full-blown revolt in mathematics.

In 1874, Georg Cantor, a German mathematician, published a paper that launched a new branch of mathematics called set theory. A set is just a well-defined collection of things, so the two shoes on your feet are a set, as are all the planetariums in the world. There’s a set with nothing in it, the empty set, and a set with everything in it.

Real number between zero and one

Now, Cantor was thinking about sets of numbers, like natural numbers (positive integers like 1, 2, 3, 4) and real numbers (which include fractions like a third, five halves, and also irrational numbers like pi, e, and the square root of two—basically any number that can be represented as an infinite decimal). He wondered, are there more natural numbers or more real numbers between zero and one? The answer might seem obvious; there are an infinite number of each, so both sets should be the same size. But to check this logic, Cantor imagined writing down an infinite list, matching up each natural number on one side with a real number between zero and one on the other.

Now, since each real number is an infinite decimal, there is no first one, so we can just write them down in any random order. The key is to make sure we get them all with no duplicates and line them up one to one with an integer. If we can do that with none left over, well then we know that the set of natural numbers and the set of real numbers between 0 and 1 are the same size.

So, assume we’ve done that. We have a complete infinite list with each integer acting like an index number, a unique identifier for each real number on the list. Now, Cantor says start writing down a new real number, and the way we’re going to do it is by taking the first digit of the first number and adding one, then take the second digit of the second number and again add one, take the third digit of the third number, add one, and keep doing this all the way down the list. If the digit is a nine, just roll it back to an eight.

By the end of this process, you’ll have a real number between zero and one. But here’s the thing, this number won’t appear anywhere on our list. It’s different from the first number in the first decimal place, different from the second number in the second decimal place, and so on down the line. It has to be different from every number on the list by at least one digit—the number on the diagonal. That’s why this is called Cantor’s diagonalization proof. It shows there must be more real numbers between 0 and 1 than there are natural numbers, extending out to infinity.

So, not all infinities are the same size. Cantor called these countable and uncountable infinities, respectively. And in fact, there are many more uncountable infinities, which are even larger.

Now, Cantor’s work was just the latest blow to mathematics. For 2000 years, Euclid’s Elements were considered the bedrock of the discipline. But at the turn of the 19th century, Lobachevsky and Gauss discovered non-Euclidean geometries, prompting mathematicians to examine more closely the foundations of their field. They did not like what they saw; the idea of a limit at the heart of calculus turned out to be poorly defined. And now, Cantor was showing that infinity itself was much more complex than anyone had imagined.

In all this upheaval, mathematics fractured, and a huge debate broke out among mathematicians at the end of the 1800s. On one side were the intuitionists who thought that Cantor’s work was nonsense. They were convinced that math was a pure creation of the human mind and that infinities like Cantor’s weren’t real. Henri Poincaré said that later generations will regard set theory as a disease from which one has recovered. Leopold Kronecker called Cantor a scientific charlatan and a corrupter of the youth, and he worked to keep Cantor from getting a job. On the other side were the formalists; they thought that math could be put on absolutely secure logical foundations through Cantor’s set theory. The informal leader of the formalists was the German mathematician David Hilbert. Hilbert was a living legend, a hugely influential mathematician who had worked in nearly every area of mathematics. He almost beat Einstein to the punch on general relativity; he developed entirely new mathematical concepts that were crucial for quantum mechanics, and he knew that Cantor’s work was brilliant.

Hilbert was convinced that a more formal and rigorous system of mathematical proof based on set theory could solve all the issues that had cropped up in math over the last century, and most other mathematicians agreed with him. “No one shall expel us from the paradise that Cantor has created,” Hilbert declared. But in 1901, Bertrand Russell pointed out a serious problem in Cantor’s set theory. Russell knew that if sets can contain anything, they can contain other sets or even themselves. For example, the set of all sets must contain itself, as does the set of sets with more than five elements in them. You could even talk about the set of all sets that contain themselves, but this leads straight to a problem: What about ‘r,’ the set of all sets that don’t contain themselves? If ‘r’ doesn’t contain itself, well, then it must contain itself. But if ‘r’ does contain itself, then by definition, it must not contain itself. So ‘r’ contains itself if and only if it doesn’t. Russell had found another paradox of self-reference.

He later explained his paradox using a hairy analogy. Let’s say there’s a village populated entirely by grown men with a strange law against beards. Specifically, the law states that the village barber must shave all and only those men of the village who do not shave themselves. But the barber himself lives in the village too, of course, and he’s a man. So who shaves him? If he doesn’t shave himself, then the barber has to shave him. But the barber can’t shave himself because he doesn’t shave anyone who shaves themselves. So the barber must shave himself if and only if he doesn’t shave himself. It’s a contradiction.

The intuitionists rejoiced at Russell’s paradox, thinking it had proven set theory hopelessly flawed. However, Zermelo and other mathematicians from Hilbert’s school solved the problem by restricting the concept of a set. The collection of all sets, for example, is not a set anymore, and neither is the collection of all sets that don’t contain themselves. This eliminated the paradoxes that come with self-reference. Hilbert and the formalists lived to fight another day, but self-reference refused to die quite that easily.

Fast forward to the 1960s, and mathematician Hao Wang was looking at square tiles with different colors on each side, like these. The rules were that touching edges must be the same color, and you can’t rotate or reflect tiles, only slide them around. The question was, if you’re given an arbitrary set of these tiles, can you tell if they will tile the plane? That is, will they connect up with no gaps all the way out to infinity? It turns out you can’t tell for an arbitrary set of tiles whether they will tile the plane or not. The problem is undecidable, just like the fate of a pattern in Conway’s Game of Life. In fact, it’s exactly the same problem, and that problem ultimately comes from self-reference, as Hilbert and the formalists were about to discover.

Hilbert wanted to secure the foundations of mathematics by developing a new system for mathematical proofs. Systems of proof were an old idea, going back to the ancient Greeks. A system of proof starts with axioms, basic statements that are assumed to be true, like ‘a straight line can be drawn between any two points.’ Proofs are then constructed from those axioms using rules of inference, methods for using existing statements to derive new statements, and these are chosen to preserve truth: if the existing statements are true, then so are the new ones.

Hilbert wanted a formal system of proof, a symbolic logical language with a rigid set of manipulation rules for those symbols. Logical and mathematical statements could then be translated into this system. ‘If you drop a book, then it will fall’ would be A, then B, and ‘No human is immortal’ would be expressed like this. Hilbert and the formalists wanted to express the axioms of mathematics as symbolic statements in a formal system and set up the rules of inference as the system’s rules for symbol manipulation.

So, Russell, along with Alfred North Whitehead, developed a formal system like this in their three-volume Principia Mathematica, published in 1913. Principia Mathematica is vast, a total of nearly 2,000 pages of dense mathematical notation. It takes 762 pages just to get to a complete proof that one plus one equals two, at which point Russell and Whitehead dryly note, ‘The above proposition is occasionally useful.’ The authors had originally planned a fourth volume, but unsurprisingly, they were too worn out to complete it. So yes, the notation is dense and exhausting, but it is also exact. Unlike ordinary languages, it leaves no room for errors or fuzzy logic to creep in. And most importantly, it allows you to prove properties of the formal system itself.

There were three big questions that Hilbert wanted answered about mathematics. Number one: Is math complete, meaning is there a way to prove every true statement? Does every true statement have a proof? Number two: Is mathematics consistent, meaning is it free of contradictions? I mean, if you can simultaneously prove A and not A, then that’s a real problem because you can prove anything at all. And number three: Is math decidable, meaning is there an algorithm that can always determine whether a statement follows from the axioms?
Now, Hilbert was convinced that the answers to all three of these questions were yes. At a major conference in 1930, Hilbert gave a fiery speech about these questions. He ended it with a line that summed up his formalist dream in opposition to the foolish “ignorabimus,” which means “we will not know.” Our slogan shall be, ‘We must know; we will know.’ These words are literally on his grave. But by the time Hilbert gave this speech, his dream was already crumbling. Just the day before, at a small meeting at the same conference, a 24-year-old logician named Kurt Gödel explained that he had found the answer to the first of Hilbert’s three big questions about completeness, and the answer was no. A complete formal system of mathematics was impossible.

incompleteness theorem

The only person who paid much attention was John von Neumann, one of Hilbert’s former students who pulled Gödel aside to ask a few questions. But the next year, Gödel published a proof of his incompleteness theorem, and this time everyone, including Hilbert, took notice.

Gödel number

This is how Gödel’s proof works: Gödel wanted to use logic and mathematics to answer questions about the very system of logic and mathematics. So he took all these basic symbols of a mathematical system and then gave each one a number. This is known as the symbol’s Gödel number. So, the symbol for ‘not’ gets the number one, ‘or’ has the Gödel number two, and ‘if then’ gets Gödel number three. Now, if you’re expressing all these symbols with numbers, then what do you do about the numbers themselves? Well, zero gets its own Gödel number, six, and if you want to write a one, you just put this successor symbol next to it. The immediate successor of zero is 1. And if you want to write 2, then you have to write ‘ss0,’ and that represents 2, and so on. So, you could represent any positive integer this way. Granted, it is cumbersome, but it works, and that is the point of this system.

Now that we have Gödel numbers for all the basic symbols and all the numbers we might want to use, we can start to write equations. For example, we could write ‘zero equals zero.’ These symbols have the Gödel numbers six and five six, and we can actually create a new card that represents this equation ‘zero equals zero.’ The way we do it is we take the prime numbers starting at 2 and raise each one to the power of the Gödel number of the symbol in our equation. So, we have 2 to the power of 6 times 3 to the power of 5 times 5 to the power of 6 equals 243 million. So, 243 million is the Gödel number of the whole equation ‘zero equals zero.’

We can write down Gödel numbers for any set of symbols you can imagine. So, really, this is an infinite deck of cards where any set of symbols that you want to write down in a sequence can be represented with a unique Gödel number. The beauty of this Gödel number is you can do a prime factorization of it and work out exactly what symbols made up this card. In this whole pack of cards, there are going to be true statements and there are going to be false statements.

So, how do you prove that something is true? Well, you need to go to the axioms. The axioms also have their own Gödel numbers, formed in the same way. Here’s an axiom which says, ‘not the successor of any number x equals zero,’ which makes sense because in this system, there are no negative numbers, and so the successor of any number cannot be zero. We can put down this axiom and then substitute in zero for x, saying that ‘one does not equal zero,’ and we’ve created a proof. This is the simplest proof I can think of that shows that ‘one does not equal zero.’

This card for the proof of ‘one does not equal zero’ gets its own Gödel number, and the way we calculate this Gödel number is just as before. We take the prime numbers and raise 2 to the power of the axiom times 3 to the power of ‘one does not equal 0,’ and we get a tremendously large number. It would have 73 million digits if you calculated it all out. So, I’ve just left it here in exponential notation. As you can see, these numbers get very large, and so you might want to start just calling them by letters. So we could say this one is Gödel number ‘a,’ this is Gödel number ‘b,’ Gödel number ‘c,’ and so on.
Gödel goes to all this trouble to find this card which says there is no proof for the statement with Gödel number ‘g.’ The trick is that the Gödel number of this card is ‘g,’ so what this statement is really saying is this card is unprovable. There is no proof anywhere in our infinite deck for this card. Now, think about it. If it’s false and there is a proof, then what you have just proven is there is no proof, so you’re stuck with a contradiction. That would mean the mathematical system is inconsistent. The other alternative is that this card is true. There is no proof for the statement with Gödel number ‘g,’ but that means this mathematical system has true statements in it that have no proof. So, your mathematical system is incomplete, and that is Gödel’s incompleteness theorem. This is how he shows that any basic mathematical system that can do fundamental arithmetic will always have statements within it which are true but which have no proof.

There’s a line in the TV show The Office that echoes the self-referential paradox of Gödel’s proof: ‘Jim is my enemy, but it turns out that Jim is also his own worst enemy, and the enemy of my enemy is my friend. So Jim is actually my friend, but because he is his own worst enemy, the enemy of my friend is my enemy. So, actually, Jim is my enemy.’ But Gödel’s incompleteness theorem means that truth and provability are not at all the same thing. Hilbert was wrong; there will always be true statements about mathematics that just cannot be proven.

Now, Hilbert could console himself with the hope that at least we could still prove math’s consistency, that is, free of contradictions. But then Gödel published his second incompleteness theorem in which he showed that any consistent formal system of math cannot prove its own consistency. So, taken together, Gödel’s two incompleteness theorems say that the best you can hope for is a consistent yet incomplete system of math. But a system like that cannot prove its own consistency, so some contradiction could always crop up in the future revealing that the system you’d been working with had been inconsistent the whole time.

And that leaves only the third and final big question from Hilbert: Is mathematics decidable? That is, is there an algorithm that can always determine whether a statement follows from the axioms? In 1936, Alan Turing found a way to settle this question. But to do it, he had to invent the modern computer. In his time, computers weren’t machines; they were people, often women, who carried out long and tedious calculations. Turing imagined an entirely mechanical computer. He wanted it to be powerful enough to perform any computation imaginable but simple enough that you could reason through its operation. So, he came up with a machine that takes as input an infinitely long tape of square cells, each containing a zero or a one. The machine has a read-write head that can read one digit at a time. Then it can perform one of only a few tasks: overwrite a new value, move left, move right, or simply halt (halting means the program has run to completion). The program consists of a set of internal instructions, you can think of it like a flow chart that tells the machine what to do based on the digit it reads and its internal state. You could imagine exporting these instructions to any other Turing machine, which would then perform in exactly the same way as the first. Although this sounds simple, a Turing machine’s arbitrarily large memory and program mean it can execute any computable algorithm if given enough time, from addition and subtraction to the entire YouTube algorithm; it can do anything a modern computer can. That’s what made the Turing machine so useful in answering Hilbert’s question on decidability.

When a Turing machine halts, the program has finished running, and the tape is the output. But sometimes a Turing machine never halts; maybe it gets stuck in an infinite loop. Is it possible to tell beforehand if a program will halt or not on a particular input? Turing realized this halting problem was very similar to the decidability problem. If he could find a way to figure out if a Turing machine would halt, then it would also be possible to decide if a statement followed from the axioms. For example, you could solve the twin prime conjecture by writing a Turing machine program that starts with the axioms and constructs all theorems that can be produced in one step using the rules of inference.
Then it constructs all theorems that can be produced in one step from those and so on. Each time it generates a new theorem, it checks if it’s the twin prime conjecture, and if it is, it halts. If not, it never halts. So if you could solve the halting problem, you could solve the twin prime conjecture and all sorts of other unsolved questions. Turing said, let’s assume we can make a machine ‘h’ that can determine whether any Turing machine will halt or not on a particular input. You insert the program and its input, and ‘h’ simulates what will happen, printing out either ‘halts’ or ‘never halts.’ For now, we don’t worry about how ‘h’ works; we just know that it always works. It always gives you the right answer.

Now we can modify the ‘h’ machine by adding additional components. One, if it receives the output ‘halts,’ immediately goes into an infinite loop. Another, if it receives ‘never halts,’ then it immediately halts. We can call this entire new machine ‘h plus,’ and we can export the program for that entire machine. Now, what happens if we give this machine its own code, both as program and input? Well, now ‘h’ is simulating what ‘h plus’ would do given its own input. Essentially, ‘h’ has to determine the behavior of a machine that it itself is a part of. In this exact circumstance, if ‘h’ concludes that ‘h plus’ never halts, well, this makes ‘h plus’ immediately halt. If ‘h’ thinks ‘h plus’ will halt, well, then that necessarily forces ‘h plus’ to loop. Whatever output the halting machine ‘h’ gives, it turns out to be wrong. There’s a contradiction. The only explanation can be that a machine like ‘h’ can’t exist. There’s no way to tell in general if a Turing machine will halt or not on a given input, and this means mathematics is undecidable. There is no algorithm that can always determine whether a statement is derivable from the axioms.

So, something like the twin prime conjecture might be unsolvable. We might never know whether there are infinite twin primes or not. The problem of undecidability even crops up in physical systems. In quantum mechanics, one of the most important properties of a many-body system is the difference in energy between its ground state and its first excited state, known as the spectral gap. Some systems have a significant spectral gap, whereas others are gapless. There’s a continuum of energy levels all the way down to the ground state. This is important because, at low temperatures, gapless quantum systems can undergo phase transitions, while gapped systems cannot. They don’t have the energy needed to overcome the spectral gap. But figuring out if a system is gapped or gapless has long been known to be a very difficult problem. Only recently, in 2015, did mathematicians prove that, in general, the spectral gap question is undecidable. To quote the authors, ‘even a perfect complete description of the microscopic interactions between a material’s particles is not always enough to deduce its macroscopic properties.’

Now, remember that Turing designed his machines to be as powerful computers as it is possible to make. To this day, the best computational systems are those that can do everything a Turing machine can. This is called Turing completeness, and it turns out there are many such Turing complete systems. Although they are powerful, every Turing complete system comes with a catch, its own analog of the halting problem, some undecidable property of the system. Wang tiles are Turing complete; their halting problem is whether or not they’ll tile the plane. Complex quantum systems are Turing complete, and their halting problem is the spectral gap question. And the Game of Life is Turing complete; its halting problem is literally whether or not it halts. There are many more examples of this, like airline ticketing systems, Magic: The Gathering, PowerPoint slides, and Excel spreadsheets. Nearly every programming language in existence is designed to be Turing complete, but in theory, we only need one programming language because, well, you can program anything at all using any Turing complete system. So here’s a Turing machine inside the Game of Life, and since the Game of Life is itself Turing complete, it should be capable of simulating itself, and indeed it is. This is the Game of Life running on the Game of Life.
The real legacy of David Hilbert’s dream is all of our modern computational devices. Kurt Gödel suffered bouts of mental instability later in life, convinced that people were trying to poison him; he refused to eat, eventually starving himself to death. Hilbert died in 1943; his epitaph was his slogan from 1930: ‘We must know, we will know.’ The truth is we don’t know; sometimes we can’t know. But in trying to find out, we can discover new things, things that change the world.

Alan Turing put his ideas about computing to practical use in World War II, leading the team at Bletchley Park that built real calculating machines to crack Nazi codes for the Allies. By one estimate, the intelligence Turing and his colleagues gathered from decrypted messages shortened the war by two to four years. After the war, Turing and John von Neumann designed the first true programmable electronic computer, ENIAC, based on Turing’s designs. But Turing didn’t live to see his ideas develop much further. In 1952, the British government convicted him of gross indecency upon learning he was gay; he was stripped of his security clearance and forced to take hormones. In 1954, he committed suicide.

Turing changed the world. He is widely considered to be the most important founding figure in computer science. All modern computers are descended from his designs. But Turing’s ideas about computability came from his concept of the Turing machine, which came from thinking about Hilbert’s question: Is math decidable? So Turing’s code-breaking machines and, indeed, all modern computers stem from the weird paradoxes that arise from self-reference.

There is a hole at the bottom of math, a hole that means we will never know everything with certainty. There will always be true statements that cannot be proven. And you might think this realization would have driven mathematicians mad and led to the disintegration of the entire mathematical enterprise. But instead, thinking about this problem transformed the concept of infinity, changed the course of a world war, and led directly to the invention of the device you’re watching this on right now.

Join us on Facebook

Join us on X

1 thought on “Maths fundamental flaw: a brief explanation”

Leave a comment