Can a Sentence Tell You It’s True? Alfred Tarski’s Puzzle
A Puzzle He Brought With Him

August 1939. Alfred Tarski (1901–1983), a Polish mathematician, boarded a ship to the United States for a conference. He didn’t know that World War II would break out before he could return. He would be stuck in America for years, separated from his wife Maria and their two children, Ina and Jan. But Tarski wasn’t just carrying a suitcase — he was carrying a puzzle that had been bothering him for over a decade.
The puzzle started with a sentence like this: “This very sentence is false.” If the sentence is true, then it must be false; if it’s false, then it must be true. This is the Liar Paradox, and it’s been tying thinkers in knots since ancient Greece. But Tarski wasn’t satisfied with just calling it a weird trick. He wanted a precise, mathematical answer: how can we say that a sentence in a formal language is true without falling into this trap?
At the time, mathematicians had built powerful formal systems to reconstruct all of classical mathematics, such as Russell and Whitehead’s type theory and Zermelo’s set theory. But no one had defined the notion of “true sentence” for those systems in a rigorous way. The word “truth” seemed too messy. Tarski believed you could clean it up, if you were careful. So he set out to do it.
The Definition That Almost Worked

Tarski’s big idea was to define truth only for a specific formal language, using a separate, more powerful language to talk about it. He called the language being described the object language, and the language doing the describing the metalanguage. It’s like a video game character: the character can’t see the code that makes its world, but a programmer outside the game can.
For a simple language — what he called the “language of the calculus of classes” (LCC) — he showed how to build a truth predicate (Tr) inside the metalanguage. The definition had to meet a condition he called Convention T: for every sentence s of the object language, the metalanguage must prove:
Tr(“s”) if and only if p
where “s” is a name for the sentence and p is its translation into the metalanguage. So, if s is a sentence saying “All whales are mammals”, the metalanguage must prove that Tr(“All whales are mammals”) holds exactly when all whales really are mammals. This captures the plain, everyday idea that a sentence is true if things are the way the sentence says they are.
How did he actually build this predicate? He defined a new notion: satisfaction. Imagine an infinite sequence of objects being assigned to the variables in a formula. A formula like “x is smaller than y” is satisfied by a sequence if the object assigned to x really is smaller than the object assigned to y. Then a sentence (which has no free variables) is true if it is satisfied by every possible sequence. Tarski gave a mathematically precise, recursive definition of satisfaction for LCC, and from it he defined truth.
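The recursive idea can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Tarski's own construction: the domain is the four subclasses of a two-element universe, formulas are encoded as nested tuples (a hypothetical encoding chosen here), and a finite dictionary from variable indices to classes stands in for an infinite sequence.

```python
from itertools import combinations

# Assumption for illustration: the "classes" are all subclasses of a two-element universe.
UNIVERSE = (0, 1)
DOMAIN = [frozenset(c) for r in range(len(UNIVERSE) + 1)
          for c in combinations(UNIVERSE, r)]

# Hypothetical encoding of formulas as nested tuples:
#   ("sub", i, j)  : the class x_i is included in the class x_j
#   ("not", f)     : negation of f
#   ("or", f, g)   : f or g
#   ("all", k, f)  : for every class x_k, f


def free_vars(f):
    """Return the set of variable indices that occur free in formula f."""
    tag = f[0]
    if tag == "sub":
        return {f[1], f[2]}
    if tag == "not":
        return free_vars(f[1])
    if tag == "or":
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "all":
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown formula: {f!r}")


def satisfies(seq, f):
    """Does the assignment seq (variable index -> class) satisfy formula f?"""
    tag = f[0]
    if tag == "sub":
        return seq[f[1]] <= seq[f[2]]
    if tag == "not":
        return not satisfies(seq, f[1])
    if tag == "or":
        return satisfies(seq, f[1]) or satisfies(seq, f[2])
    if tag == "all":
        # Every assignment that differs from seq at most in position k must satisfy f.
        return all(satisfies({**seq, f[1]: obj}, f[2]) for obj in DOMAIN)
    raise ValueError(f"unknown formula: {f!r}")


def is_true(sentence):
    """A sentence is true if every assignment satisfies it."""
    if free_vars(sentence):
        raise ValueError("not a sentence: it has free variables")
    # With no free variables the starting assignment is irrelevant,
    # so checking the empty one is enough.
    return satisfies({}, sentence)


every_class_includes_itself = ("all", 1, ("sub", 1, 1))
every_class_includes_every_class = ("all", 1, ("all", 2, ("sub", 1, 2)))
some_class_is_in_every_class = ("not", ("all", 1, ("not", ("all", 2, ("sub", 1, 2)))))
```

In this toy domain, `is_true(every_class_includes_itself)` and `is_true(some_class_is_in_every_class)` come out true (the empty class is included in every class), while `is_true(every_class_includes_every_class)` comes out false. Note that `is_true` lives in Python, which plays the role of the metalanguage here; the tuple language itself has no way to mention it.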
It worked beautifully for that simple language. All the biconditionals required by Convention T could be proved, and the liar paradox couldn’t sneak in — because the truth predicate Tr existed only in the metalanguage, not in the object language. A sentence like “I am not true” couldn’t even be formed in LCC, since Tr wasn’t part of LCC’s vocabulary.
Why You Can’t Catch Your Own Tail

But Tarski soon hit a wall. What about a richer language, one that could talk about all levels of classes — individuals, classes of individuals, classes of classes, and so on without end? He called this the “language of the general theory of classes” (LGTC). Could you define a truth predicate for LGTC using a metalanguage that had the same mathematical power as LGTC itself? That is, could you catch truth while staying inside the same kind of mathematical system?
Tarski proved that you cannot. This is his famous Indefinability Theorem. If you try to define a truth predicate Tr for LGTC inside a metalanguage that is just LGTC plus a theory of its own syntax, you will inevitably end up deriving a contradiction — a sentence that is true if and only if it is false.
The proof uses a technique invented by Kurt Gödel: you can encode sentences as numbers, then construct a sentence that “says” of itself that it is not in the set of true sentences. It’s the liar paradox, now locked inside a perfectly formal system. The upshot is devastating: no sufficiently expressive language with a consistent theory can contain its own truth predicate. You always need a more powerful metalanguage to define truth for a given language.
Tarski later showed that if you allow a much more powerful metatheory, such as one with transfinite types or set theory, you can define a truth predicate for LGTC. But the general lesson stands: truth for a system cannot be fully captured by the system itself. The ladder of truth always needs a higher rung.
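The core of the argument can be laid out in four lines. This is a schematic sketch in modern notation rather than Tarski's own presentation: the corner quotes stand for the numerical code of a sentence, and the names (T), (D), (C) and the symbol λ for the self-referential sentence are labels chosen here for illustration.

```latex
% Assumes the amsmath and amssymb packages.
% \ulcorner\varphi\urcorner stands for the numerical code of the sentence \varphi.
\begin{align*}
&\text{(T)} && \mathrm{Tr}(\ulcorner \varphi \urcorner) \leftrightarrow \varphi
  && \text{for every sentence } \varphi \text{ (supposing Tr were definable)} \\
&\text{(D)} && \lambda \leftrightarrow \neg\,\mathrm{Tr}(\ulcorner \lambda \urcorner)
  && \text{the self-referential sentence built by coding} \\
&\text{(T}_\lambda\text{)} && \mathrm{Tr}(\ulcorner \lambda \urcorner) \leftrightarrow \lambda
  && \text{the instance of (T) with } \varphi = \lambda \\
&\text{(C)} && \mathrm{Tr}(\ulcorner \lambda \urcorner) \leftrightarrow \neg\,\mathrm{Tr}(\ulcorner \lambda \urcorner)
  && \text{from (T}_\lambda\text{) and (D)}
\end{align*}
```

Line (C) is a contradiction, so the supposition behind (T) has to go: no formula of the language can play the role of Tr for all of its own sentences.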
When Following Rules Isn’t Enough

Tarski wasn’t done. In 1936, he applied his new semantic tools to another big question: what does it mean for one sentence to be a logical consequence of others? Before him, many logicians thought that a sentence X follows logically from a set of sentences K just when you can derive X from K using a fixed set of mechanical rules of inference. But Tarski pointed out a problem.
There are theories (like LGTC) where you can prove each of an infinite list of particular sentences: “0 has property P”, “1 has property P”, “2 has property P”, and so on, for every natural number — yet you can’t prove the universal sentence “Every natural number has property P” using those rules. Intuitively, that universal sentence must be true if all the particular ones are true. So the rule‑based notion of consequence is too weak.
Tarski proposed a different, semantic definition: X is a logical consequence of K if and only if every model (interpretation) that makes all sentences in K true also makes X true. He built this using his definition of satisfaction. For a given language, an interpretation assigns a domain of objects and meanings to the non-logical constants. A sentence is a logical truth if it is true under every possible interpretation.
This definition captures the everyday idea that a good argument should preserve truth no matter how you reinterpret the non-logical words. Crucially, it also handles the problem of ω‑incompleteness: the universal sentence “Every natural number has P” will be true in every model where all the specific sentences are true, because a counterexample would have to be a natural number without P, contradicting the fact that every particular number has P in that model. The argument relies on “natural number” and the numerals keeping their fixed meaning across models, so that every number in the domain is named by one of 0, 1, 2, and so on; if models could contain extra, unnamed “numbers,” the universal sentence could fail.
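The "true under every interpretation" test can be tried out on a very small scale. The sketch below is only an illustration under stated assumptions: it handles just two hypothetical sentence forms ("all A are B" and "some A is B"), and it searches only finite domains up to a chosen size, so finding no countermodel is evidence of consequence within that bound, not a proof in general. The function names are made up for this example.

```python
from itertools import combinations, product


def subsets(domain):
    """All subsets of a finite domain, as frozensets."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Hypothetical sentence encoding:
#   ("all", "A", "B")   : every A is a B
#   ("some", "A", "B")  : at least one A is a B


def holds(sentence, interp):
    """Is the sentence true when each non-logical word denotes interp[word]?"""
    kind, p, q = sentence
    if kind == "all":
        return interp[p] <= interp[q]
    if kind == "some":
        return bool(interp[p] & interp[q])
    raise ValueError(f"unknown sentence form: {sentence!r}")


def interpretations(names, max_size):
    """Yield every way to assign subsets of a domain of size 1..max_size to the names."""
    for n in range(1, max_size + 1):
        for choice in product(subsets(range(n)), repeat=len(names)):
            yield dict(zip(names, choice))


def find_countermodel(premises, conclusion, max_size=3):
    """Return an interpretation making all premises true and the conclusion false, or None."""
    names = sorted({name for s in list(premises) + [conclusion] for name in s[1:]})
    for interp in interpretations(names, max_size):
        if all(holds(s, interp) for s in premises) and not holds(conclusion, interp):
            return interp
    return None


valid = find_countermodel(
    [("all", "whale", "mammal"), ("all", "mammal", "animal")],
    ("all", "whale", "animal"),
)
invalid = find_countermodel(
    [("all", "whale", "mammal")],
    ("all", "mammal", "whale"),
)
```

Here `valid` is `None`: no reinterpretation of "whale", "mammal", and "animal" within the bound makes the premises true and the conclusion false. By contrast, `invalid` is an actual countermodel, an interpretation in which every "whale" is a "mammal" but some "mammal" is not a "whale".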
Which Words Are Truly Logical?

There was still a catch. Tarski’s definition of logical consequence depends on deciding which words are “logical constants” (like “and”, “or”, “all”) and which are “extra-logical” (like “cat” or “red”). Without a clear dividing line, the definition stays fuzzy.
Later in his career, in a 1966 lecture, Tarski offered a striking solution. Think of all the objects in the universe. Now imagine reshuffling them with a permutation: a one‑to‑one mapping of the universe onto itself, so that every object is sent to exactly one object (possibly itself) and no object is left out. A logical notion, Tarski proposed, is one that stays the same under every possible permutation of the universe. The truth‑functional connectives, the quantifiers “for all” and “there exists,” and even the membership relation of type theory all turn out to be invariant in this way. They don’t care which object is which; they only care about form.
But here’s the twist: in ordinary first‑order set theory, the membership predicate (∈) is not invariant under all permutations, because it matters which specific objects are members of which. So membership can be treated as logical if you build your theory one way (in the theory of types), and as non‑logical if you build it another way (in standard set theory). Tarski was comfortable with this. The boundary between logical and extra‑logical terms might shift depending on what you take as your starting point. There’s no single, eternal list of logical words.
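The contrast between the two kinds of membership can be checked by brute force on a tiny universe. The sketch below makes several assumptions for illustration: the universe has only three objects, relations are plain sets of tuples, and the names `flat_membership` and `typed_membership_is_invariant` are invented here. In the flat case, the object "s" merely plays the role of a set containing "a"; in the typed case, classes sit one level above individuals and a permutation of individuals carries each class along with its members.

```python
from itertools import combinations, permutations, product


def image(relation, perm):
    """Apply a permutation (dict: object -> object) to every tuple of a relation."""
    return {tuple(perm[x] for x in tup) for tup in relation}


def is_invariant(relation, domain):
    """True if every permutation of the domain maps the relation onto itself."""
    domain = list(domain)
    return all(
        image(relation, dict(zip(domain, p))) == set(relation)
        for p in permutations(domain)
    )


# Flat universe: "s" is just another object, standing in for a set whose member is "a".
DOMAIN = ["a", "b", "s"]
identity = {(x, x) for x in DOMAIN}
difference = {(x, y) for x, y in product(DOMAIN, repeat=2) if x != y}
flat_membership = {("a", "s")}


def typed_membership_is_invariant(individuals):
    """Check membership between individuals and classes of individuals.

    A permutation of individuals is lifted to classes by moving each member.
    """
    individuals = list(individuals)
    classes = [frozenset(c) for r in range(len(individuals) + 1)
               for c in combinations(individuals, r)]
    member = {(x, s) for x in individuals for s in classes if x in s}
    for p in permutations(individuals):
        perm = dict(zip(individuals, p))
        moved = {(perm[x], frozenset(perm[y] for y in s)) for x, s in member}
        if moved != member:
            return False
    return True
```

On this universe, `is_invariant(identity, DOMAIN)` and `is_invariant(difference, DOMAIN)` hold, and so does `typed_membership_is_invariant(["a", "b", "c"])`. But `is_invariant(flat_membership, DOMAIN)` fails: swapping "a" and "b" turns the pair ("a", "s") into ("b", "s"), which was never in the relation.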
Why This Still Haunts Us

Tarski’s work didn’t just tidy up a philosophical puzzle; it helped lay the foundations for modern logic, computer science, and our understanding of language. The indefinability theorem shows that any consistent formal system powerful enough to talk about basic arithmetic can’t define its own truth, a limit that echoes Gödel’s incompleteness theorems and appears whenever self‑reference meets precision.
When you play a video game and a character says “I know I’m in a simulation,” the game’s code can’t make that statement true about the code itself without risking a kind of logical crash. Tarski’s insight is that truth is always a step outside the system you’re looking at. It’s a humbling reminder: no map can perfectly represent the territory while being part of it.
So next time you read a sentence like “This sentence is false,” don’t just get dizzy — remember the mathematician who crossed an ocean, lost his home, and built a ladder of truths that, at the top, still pointed to a higher rung.
Think about it
- If a video game character said “I am not part of any game,” could that sentence be true inside the game’s own world? Why or why not?
- Can you think of a rule‑based system (like a board game or a computer program) where some obvious truth can’t be proved using only the official rules?
- Tarski’s definition of a logical word depends on what you treat as part of the “universe.” Do you think the word “is” should be logical in every context, or could it sometimes be just a regular word? Why?





