Philosophy for Kids

The Set That Wasn’t: How a Logical Contradiction Remade Math

A Letter That Shook the Foundations

One short letter in 1902 undid years of work on the logic of sets.

In June 1902, a young British philosopher named Bertrand Russell (1872–1970) was working on a huge project: to prove that all of mathematics could be built from pure logic. He wrote to the great German logician Gottlob Frege, who was about to publish the second volume of his own life’s work. Russell’s letter was polite, but it contained a bomb. He had found a simple rule that led straight to a contradiction. The rule seemed perfectly safe: for any clear property, you can form the set of all things that have it. But when Russell tried to form the set of all sets that do not contain themselves, logic twisted into a knot.

To understand why, imagine a set as a collection. A set of pens, a set of even numbers, or even a set of sets. Most sets don’t contain themselves. The set of all pens is not itself a pen, so it does not belong to itself. Now consider the set of all sets that are not members of themselves — call it R. Ask the question: does R belong to R? If it does, then R must satisfy the condition “does not contain itself,” so it doesn’t. If it doesn’t, then it satisfies the condition, so it does. Both answers force the opposite answer. That is a paradox: a statement that leads to a contradiction from seemingly correct reasoning.
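The same knot can be tied in a few lines of Python. This is only a rough sketch, not a piece of real set theory: it assumes we stand in for a "set" with a yes-or-no function (a predicate) that answers whether something belongs. The names russell, always_false, always_true, and ask_russell_about_itself are made up for this illustration.

```python
def russell(predicate):
    """Hold of a predicate exactly when that predicate does NOT hold of itself."""
    return not predicate(predicate)


def always_false(predicate):
    """A 'set' with no members at all."""
    return False


def always_true(predicate):
    """A 'set' that contains everything, itself included."""
    return True


def ask_russell_about_itself():
    """Ask whether R belongs to R; Python never manages to settle on an answer."""
    try:
        return russell(russell)
    except RecursionError:
        # Each attempt to answer asks the same question again, without end.
        return "no answer"


print(russell(always_false))        # True: always_false does not contain itself
print(russell(always_true))         # False: always_true does contain itself
print(ask_russell_about_itself())   # no answer
```

For ordinary predicates the question has a plain answer. Asked about itself, russell keeps flipping between "yes" and "no" until Python gives up, which is the program's version of the paradox.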

Russell wasn’t alone. The mathematician Georg Cantor (1845–1918), the creator of set theory, had already stumbled into similar problems with the “set of all sets” and the collection of all ordinal numbers. The rules that everyone had trusted for defining sets suddenly looked like traps.

The Trouble with “All”

The idea of a set of all sets turned out to be too big to hold without breaking.

Why did these contradictions appear? The common ingredient is self-reference: talking about a collection as if it is already finished, and then asking a question that points right back at the collection itself. Cantor had proved that for any set, the set of all its subsets (called its power set) must be strictly larger. So the “set of all sets” would have to be the biggest possible set, but its power set would be even bigger, which is a contradiction. The only way out was to deny that such a totality can be treated as a finished, well-behaved set.
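Cantor's argument can be checked by brute force on a tiny set. The sketch below is an illustration for finite sets only (Cantor's proof covers infinite sets too), and the function names are invented for this example. It tries every possible way of assigning one subset to each element and confirms that some subset is always left out: the "diagonal" subset of elements that are not in their own assigned subset.

```python
from itertools import chain, combinations, product


def all_subsets(elements):
    """Return every subset of the given elements (the power set) as frozensets."""
    items = list(elements)
    groups = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(group) for group in groups]


def diagonal_set(elements, assignment):
    """Collect the elements that do NOT belong to the subset assigned to them."""
    return frozenset(x for x in elements if x not in assignment[x])


def some_subset_is_always_missed(elements):
    """Check every assignment of subsets to elements; the diagonal set is never hit."""
    items = list(elements)
    subsets = all_subsets(items)
    for choice in product(subsets, repeat=len(items)):
        assignment = dict(zip(items, choice))
        if diagonal_set(items, assignment) in choice:
            return False
    return True


print(len(all_subsets(["a", "b", "c"])))              # 8 subsets for 3 elements
print(some_subset_is_always_missed(["a", "b", "c"]))  # True
```

No matter how the three elements are paired with subsets, at least one subset has no partner, so the collection of subsets really is bigger.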

Cantor drew a distinction that was still blurry in 1900: some totalities are too vast to be sets. He called them “inconsistent multiplicities”; today they are called proper classes. You can talk about “all sets,” but you cannot treat the whole thing as a single thing that can itself belong to other collections. The paradoxes didn’t destroy set theory; they revealed that the word “set” needed clearer rules.

Meanwhile, mathematicians in France were finding paradoxes that didn’t involve gargantuan infinities. They turned on the idea of definability — what can be named with words.

Words That Eat Themselves

Berry’s sentence fits on a single card but still manages to trap itself.

Take the ancient Liar paradox: a person says “I am lying.” If he is telling the truth, then he is lying; if he is lying, then he is telling the truth. The sentence seems to twist around itself. Russell realized that similar loops appear in mathematics when we talk about “definable” numbers.

His colleague G. G. Berry, a librarian at Oxford, sharpened the point. Consider the natural numbers that can be defined in English using fewer than eighteen syllables. There are only finitely many such phrases, so only finitely many numbers can be defined this way, and there must be a smallest number that cannot. But the very phrase “the least number not definable under eighteen syllables” defines that number, and it uses only 16 syllables. So the number both is and is not definable within the limit.

The problem here is not size but circularity: the definition sneaks a reference to the whole collection of that kind of definition, including itself. The French mathematician Henri Poincaré (1854–1912) argued that such impredicative definitions — ones that refer to the whole set to which the thing being defined belongs — are viciously circular and should be banned. To him, a mathematical object doesn’t really exist unless you can build it step by step without looping back.

Building a Safe Tower

Russell’s type theory arranges mathematical objects in layers, so no collection can refer to itself.

Russell and the logician Alfred North Whitehead (1861–1947) tried to solve all the paradoxes at once by constructing a ramified theory of types. The idea is that every object, set, or propositional function comes with a level — a type. Individuals (things that are not collections) sit at the bottom. Sets of individuals are one type up. Sets of sets of individuals are yet higher. A set can only contain things of a lower type, so a set can never belong to itself. The Liar, too, is blocked: when you say “I am lying,” you are making a statement that refers to all statements of a certain order; but that statement itself must be of a higher order, so the Liar is simply false and not truly contradictory.
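The layering rule can be sketched in Python. This is a simplified toy, not Russell and Whitehead's actual ramified system: it assumes every object carries a single number for its level, with plain individuals at level 0. The class TypedSet and the helper level_of are hypothetical names made up for this illustration.

```python
def level_of(thing):
    """Individuals sit at level 0; a TypedSet reports the level it was given."""
    return thing.level if isinstance(thing, TypedSet) else 0


class TypedSet:
    """A collection that only accepts members from lower levels."""

    def __init__(self, level, members=()):
        if level < 1:
            raise ValueError("a set must sit at level 1 or higher")
        self.level = level
        self.members = []
        for member in members:
            self.add(member)

    def add(self, member):
        # The key rule: a member must come from a strictly lower level.
        if level_of(member) >= self.level:
            raise TypeError("a set can only contain things of a lower type")
        self.members.append(member)


pens = TypedSet(1, ["red pen", "blue pen"])   # a set of individuals
pen_collections = TypedSet(2, [pens])         # a set of sets of individuals

try:
    pens.add(pens)                            # a set trying to contain itself
except TypeError as error:
    print("Blocked:", error)
```

Because a set can never sit below its own level, the question "does this set contain itself?" cannot even be set up.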

At the same time, the German mathematician Ernst Zermelo (1871–1953) offered a different repair. He wrote down a list of axioms (basic starting rules) for set theory that carefully restricted how you could form new sets. Instead of “any property defines a set,” Zermelo’s separation axiom said you could only carve out subsets from a set that already exists. That blocked the Russell set: you can’t form the set of all sets that don’t contain themselves because there is no universal set to carve it from. Both approaches — types and axioms — gave mathematicians a way to keep doing mathematics while avoiding the traps.
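Zermelo's carving-out rule can also be tried in Python. The sketch below is a toy under a stated assumption: sets are modeled with Python's built-in frozenset, and the names separate and not_member_of_itself are invented for this example. The function separate never builds a set from thin air; it only filters a set that already exists.

```python
def separate(existing_set, condition):
    """Carve out the members of an existing set that satisfy the condition."""
    return frozenset(x for x in existing_set if condition(x))


def not_member_of_itself(some_set):
    """Russell's condition: the set does not contain itself."""
    return some_set not in some_set


# Build a few small sets out of the empty set.
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
starting_set = frozenset({empty, one, two})

# Russell's set, but carved only from starting_set.
russell_subset = separate(starting_set, not_member_of_itself)

print(russell_subset == starting_set)   # True: every member passed the test
print(russell_subset in starting_set)   # False: the carved set lies outside
```

With this rule the Russell-style set exists without trouble. It simply turns out not to be a member of the set it was carved from, so the question that caused the contradiction never gets a chance to bite.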

Can a Language Define Its Own Truth?

Tarski showed that you can only define truth for a language from a richer language one level above it.

The semantic paradoxes pushed the Polish logician Alfred Tarski (1901–1983) to tackle a deep question: can a language contain its own truth predicate without leading to contradiction? He started from a perfectly natural demand. If a sentence is true, it should correspond to the world. So for any sentence S, the statement “S is true” should hold exactly when S itself holds. That gives us the famous T‑schema: “Snow is white” is true if and only if snow is white.

But now suppose our language can talk about its own sentences, and it contains a truth predicate that applies to them. We can construct a Liar sentence L that says “L is not true.” By the T‑schema, L is true if and only if L is not true — contradiction. Tarski’s conclusion: no consistent language can contain its own truth predicate and also obey classical logic while allowing self‑referential sentences. To talk about truth, you must move to a metalanguage — a richer language outside the original one. There is no final language in which you can define truth for absolutely everything.

This result wasn’t a defeat; it was a precise limit. It gave logicians a clear picture of how semantics — the study of meaning — can be formalized without paradox.

Why the Ripples Still Reach You Today

Ideas invented to stop paradoxes live on in the programming languages and databases you use every day.

The paradoxes that shook mathematics between 1897 and 1930 didn’t just vanish. They became the scaffolding for modern logic, computer science, and even our understanding of knowledge. When you use a programming language like Java or Python, you’re working inside a type system that rejects certain meaningless combinations, such as dividing a word by a number. That idea traces back, through later logicians, to Russell’s types. Database queries are built on operations from set theory, the same theory that had to be repaired after Russell’s discovery.

Even in everyday thinking, the paradoxes teach a lasting lesson: be careful when you use words such as “all” or “definable” in a way that can loop back onto itself. If you say “everything I say is false,” you’re not making a meaningful statement; you’re tying your own tongue. Philosophers still argue about whether there is a single perfect solution to the Liar, or whether truth is always somewhat incomplete. But the crisis of the early twentieth century gave us an invaluable gift: a map of where logical traps live, and a set of tools to build around them.

Think about it

  1. If you wrote a computer program that tried to list all programs that never list themselves, would you hit the same kind of problem? Why or why not?
  2. Could there be a secret rule that avoids all paradoxes, or does every attempt to talk about “everything” eventually create a loop?
  3. Imagine a friend says, “Everything on the internet is fake.” If that statement itself is on the internet, should you believe it? What does that tell you about sweeping claims?