Can a Red Shirt Prove All Ravens Are Black?
Why would a red shirt matter to a bird scientist?

Imagine you are a bird scientist trying to prove that all ravens are black. You travel through forests, search nests, and log every raven you see: black, black, black. One day, at home, you notice a bright red shirt in your laundry. Could that shirt help confirm your raven theory? It sounds ridiculous. Yet a simple rule about evidence says a red shirt—a non-black thing that is not a raven—supports the claim “All ravens are black.” This is the raven paradox, and it exposes how tricky it is to know when evidence really backs up a belief.
The logic works like this. The sentence “All ravens are black” can be rewritten as “If something is a raven, then it is black.” In logic, that statement is equivalent to its contrapositive: “If something is not black, then it is not a raven.” The philosopher Jean Nicod (1893–1924) proposed a simple test: a universal generalization is confirmed by each positive instance that fits it, as long as no counterexample turns up. A red shirt is both non-black and not a raven, so it is a positive instance of the contrapositive. And if evidence that confirms a claim also confirms anything logically equivalent to it, the shirt confirms “All ravens are black” too. Carl Hempel (1905–1997) famously examined this rule, now known as Nicod’s Criterion, and the paradox it produces. The problem is that the rule makes checking your closet just as useful as hiking through the woods. Nelson Goodman (1906–1998) joked about “indoor ornithology.” That can’t be right. Or can it?
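The equivalence between a conditional and its contrapositive can be checked by brute force over every combination of "is a raven" and "is black." The following is a minimal Python sketch; the helper name `implies` is introduced here for illustration only.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b

# Enumerate all four combinations of "is a raven" and "is black".
rows = []
for is_raven, is_black in product([True, False], repeat=2):
    original = implies(is_raven, is_black)                # raven -> black
    contrapositive = implies(not is_black, not is_raven)  # not black -> not raven
    rows.append((is_raven, is_black, original, contrapositive))

# The two statements have the same truth value in every row.
equivalent = all(orig == contra for _, _, orig, contra in rows)
print(equivalent)
```

The red shirt corresponds to the row where both `is_raven` and `is_black` are false, and in that row both statements come out true.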
Can math solve the puzzle? Probability steps in.

Instead of a yes/no verdict, we can think of confirmation in degrees. A piece of evidence confirms a hypothesis if it makes the hypothesis more probable. In the early twentieth century, philosophers built a mathematical theory of probability on three simple rules. Probabilities are numbers between 0 (impossible) and 1 (certain). A tautology (a statement true by logic alone) has probability 1. And if two outcomes can’t both happen, the probability that one or the other happens is the sum of their probabilities.
Conditional probability asks: how likely is one thing, given that another is true? It is written \( p(B \mid A) \) and equals the probability of both happening divided by the probability of \( A \), provided \( p(A) \) is greater than 0. The degree of confirmation is then \( c(H,E) = p(H \mid E) - p(H) \). If this number is positive, the evidence \( E \) confirms the hypothesis \( H \).
A powerful result called Bayes’ theorem ties everything together:
\[ p(H \mid E) = p(H) \times \frac{p(E \mid H)}{p(E)} \]
This one equation unites three commonsense ideas. First, theoretical fit: the better a hypothesis predicts the evidence (\( p(E \mid H) \) high), the more confirmation it gets. Second, novelty: if the evidence is surprising (\( p(E) \) low), confirmation is stronger. Third, prior plausibility: the more believable the hypothesis was to begin with (\( p(H) \) high), the more believable it remains.
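A small numerical sketch can make the novelty effect concrete. The function names and all the numbers below are assumptions chosen for illustration. The code holds the prior and the fit fixed and varies only how surprising the evidence is.

```python
from fractions import Fraction

def posterior(prior, likelihood, evidence):
    """Bayes' theorem: p(H|E) = p(H) * p(E|H) / p(E)."""
    if evidence == 0:
        raise ValueError("p(E) must be positive")
    return prior * likelihood / evidence

def confirmation(prior, likelihood, evidence):
    """Degree of confirmation: c(H, E) = p(H|E) - p(H)."""
    return posterior(prior, likelihood, evidence) - prior

# Illustrative values only (assumed, not taken from any real case).
p_h = Fraction(1, 5)            # prior plausibility p(H)
p_e_given_h = Fraction(9, 10)   # theoretical fit p(E|H)

surprising = confirmation(p_h, p_e_given_h, Fraction(3, 10))  # p(E) = 0.3
expected = confirmation(p_h, p_e_given_h, Fraction(3, 5))     # p(E) = 0.6

print(surprising, expected)
```

With these assumed values the surprising evidence raises the probability of the hypothesis by 2/5, while the more expected evidence raises it by only 1/10.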
Now back to ravens. The quantitative approach says a red shirt does confirm “All ravens are black,” but by an astronomically small amount. There are so many non-black, non-raven things in the world that the hypothesis hardly gains from one more. On this view the raven paradox is an illusion: we confuse a minuscule boost with zero boost. But here’s the catch: this resolution works only if we make certain assumptions about background probabilities, such as how many ravens and how many non-black things there are. In some situations, even spotting a black raven could lower the probability that all ravens are black, if discovering any ravens at all makes it more likely that non-black ones exist too. So Nicod’s Criterion fails in general.
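Both halves of this point can be seen in a toy model. Every number below is an assumption made up for the sketch: a single rival hypothesis, known population sizes, and random sampling. The first two cases compare a black raven with a red shirt. The third uses two candidate worlds that differ in how many ravens exist, so that finding a raven at all favors the world containing a white one.

```python
from fractions import Fraction

def update(prior, like_h, like_alt):
    """Posterior of H against a single rival hypothesis, via Bayes' theorem."""
    return prior * like_h / (prior * like_h + (1 - prior) * like_alt)

# Toy world (all numbers are assumptions for illustration).
RAVENS = 100             # total number of ravens
NON_BLACK = 1_000_000    # total number of non-black objects
ODD_RAVENS = 10          # non-black ravens, if the rival hypothesis is true
PRIOR = Fraction(1, 2)   # prior for H: "all ravens are black"

# Evidence 1: pick a raven at random; it turns out to be black.
raven_gain = update(PRIOR, 1, Fraction(RAVENS - ODD_RAVENS, RAVENS)) - PRIOR

# Evidence 2: pick a non-black object at random; it turns out not to be a raven.
shirt_gain = update(PRIOR, 1, Fraction(NON_BLACK - ODD_RAVENS, NON_BLACK)) - PRIOR

# Evidence 3: pick a bird at random; it turns out to be a black raven.
# World A (H true):  100 black ravens, no other ravens, 1,000,000 other birds.
# World B (H false): 1,000 black ravens, 1 white raven, 1,000,000 other birds.
like_a = Fraction(100, 100 + 1_000_000)
like_b = Fraction(1_000, 1_000 + 1 + 1_000_000)
black_raven_gain = update(PRIOR, like_a, like_b) - PRIOR

print(float(raven_gain), float(shirt_gain), float(black_raven_gain))
```

Under these assumptions the black raven raises the probability of the hypothesis by 1/38 and the red shirt by 1/399998: positive, but thousands of times smaller. In the third case the black raven lowers the probability, because ravens are ten times more common in the world where one of them is white.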
That reveals the problem of the priors: where do our starting probabilities come from? The probability axioms alone don’t tell us whether a red shirt matters or not. Philosophers split into two camps. Subjectivists say any initial probabilities are fine as long as they obey the three axioms—there’s no single “correct” starting point. Objectivists argue that a rule called the Principle of Indifference should set the initial probabilities: when no evidence favors one possibility over others, give them equal weight.
Hume’s head-scratcher: why trust the future to be like the past?

Even if we solve the priors problem, a deeper puzzle lurks. David Hume (1711–1776) argued that we have no logical reason to expect the future to resemble the past. Probability can show exactly why this is hard. Imagine you flip a coin ten times. The sequence could be any of 1024 patterns. If we treat every sequence as equally probable (each 1/1024), then after nine tails the chance of a tenth tail is still 1/2—same as heads. The first nine tosses teach us nothing.
But if we assign probabilities differently—grouping sequences by the number of tails, as Rudolf Carnap (1891–1970) suggested—after nine tails the probability of another tail jumps to 10/11. The probability axioms permit both choices. Induction—learning from past patterns—is not built into the math.
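The two answers can be reproduced by enumerating all 1024 sequences under each prior. This is a sketch only: `uniform_prior` and `grouped_prior` are hypothetical names, and the grouped prior is one simple way to implement the idea described above (equal weight for each possible number of tails, shared equally among the sequences with that number).

```python
from fractions import Fraction
from itertools import product
from math import comb

N = 10
sequences = list(product("HT", repeat=N))  # all 2**10 = 1024 patterns

def uniform_prior(seq):
    """Every individual sequence is equally probable."""
    return Fraction(1, 2 ** N)

def grouped_prior(seq):
    """Each possible number of tails (0..N) gets weight 1/(N+1),
    split equally among the sequences with that many tails."""
    tails = seq.count("T")
    return Fraction(1, N + 1) / comb(N, tails)

def prob_tenth_tail_after_nine(prior):
    """p(toss 10 is tails | tosses 1-9 were all tails) under the given prior."""
    nine_tails = [s for s in sequences if all(x == "T" for x in s[:9])]
    p_nine = sum(prior(s) for s in nine_tails)
    p_ten = sum(prior(s) for s in nine_tails if s[9] == "T")
    return p_ten / p_nine

uniform_result = prob_tenth_tail_after_nine(uniform_prior)
grouped_result = prob_tenth_tail_after_nine(grouped_prior)
print(uniform_result, grouped_result)
```

Both priors obey the axioms (each sums to 1 over the 1024 sequences), yet the first gives 1/2 and the second gives 10/11.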
Carnap’s way relies on the Principle of Indifference (PoI): when you have no evidence favoring any possibility, treat them as equally likely. Yet the PoI gives contradictory answers depending on how you carve up the possibilities. In a three-horse race, the probability that Athena wins is 1/2 if you divide the outcomes into “Athena wins” and “Athena loses,” but 1/3 if you divide into “Athena wins,” “Beatrice wins,” “Cecil wins.” Even finer divisions—winning by a quarter length, an eighth, and so on—multiply the confusion. The same problem haunts assigning prior probabilities to coin tosses.
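The partition problem is easy to reproduce. In the sketch below, `indifference` is a hypothetical helper that simply spreads probability evenly over whatever list of possibilities it is handed, which is all the principle itself tells us to do.

```python
from fractions import Fraction

def indifference(partition):
    """Assign equal probability to each cell of the given partition."""
    return {cell: Fraction(1, len(partition)) for cell in partition}

# Two ways of carving up the same race.
coarse = indifference(["Athena wins", "Athena loses"])
fine = indifference(["Athena wins", "Beatrice wins", "Cecil wins"])

print(coarse["Athena wins"], fine["Athena wins"])
```

The same event receives probability 1/2 under the first partition and 1/3 under the second, and nothing in the rule says which partition to use.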
Subjectivists conclude that no single prior is rationally required; any prior is permissible. Induction is rational if you start with Carnap-like beliefs, but you could start as a skeptic without violating the rules. Objectivists keep searching for a principled version of the PoI.
A full theory of learning also needs a rule for updating beliefs when new evidence arrives. The standard rule, Conditionalization, says your new degree of belief in \( H \) after learning \( E \) should equal your old conditional probability \( p(H \mid E) \). That sounds obvious, but justifying it requires separate arguments, and those arguments remain fiercely debated.
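As a rough illustration, Conditionalization can be sketched as an operation on a table of degrees of belief over possible worlds. The representation (a dictionary from worlds to probabilities), the function names, and the two-toss example are all assumptions made for this sketch.

```python
from fractions import Fraction

def prob(credences, proposition):
    """Probability of a proposition, given as the set of worlds where it holds."""
    return sum(p for world, p in credences.items() if world in proposition)

def conditionalize(credences, evidence):
    """New credences after learning `evidence`: worlds ruled out drop to 0,
    and the remaining worlds are rescaled so they sum to 1."""
    p_e = prob(credences, evidence)
    if p_e == 0:
        raise ValueError("cannot conditionalize on evidence with probability 0")
    return {w: (p / p_e if w in evidence else Fraction(0))
            for w, p in credences.items()}

# Hypothetical example: two coin tosses, four equally credible worlds.
old = {w: Fraction(1, 4) for w in ("HH", "HT", "TH", "TT")}
E = {"TH", "TT"}   # evidence: the first toss landed tails
H = {"HT", "TT"}   # hypothesis: the second toss lands tails

old_conditional = prob(old, H & E) / prob(old, E)  # old p(H | E)
new = conditionalize(old, E)

print(old_conditional, prob(new, H))
```

The new unconditional probability of the hypothesis matches the old conditional probability, which is exactly what the rule requires.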
Is there anything we cannot know, even in principle?

While Hume asks how we learn from experience, another puzzle asks whether some truths are forever beyond our reach. Alonzo Church (1903–1995) and Frederic Fitch (1908–1987) discovered a startling result: if every truth could be known, then every truth would actually be known already. That can’t be right.
Here is the rough idea, using epistemic logic, a logic where \( K\phi \) means “somebody knows that \( \phi \).” Suppose everything true could be known: whenever \( \phi \) is true, it is possible to know \( \phi \). Now take the sentence “\( \phi \) is true and nobody knows \( \phi \).” If anybody knew that sentence, they would have to know both parts: that \( \phi \) is true and that nobody knows \( \phi \). But knowing the first part means somebody knows \( \phi \), which makes the second part false, and a falsehood cannot be known. So that sentence cannot possibly be known. Therefore, if every truth were knowable, that sentence could never be true, which forces the conclusion that whenever \( \phi \) is true, it is already known. That is absurd, so there must be unknowable truths. The knowability paradox shows knowledge has limits.
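The same steps can be laid out a little more formally. This is a sketch under assumptions the informal version leaves implicit: knowledge is factive (whatever is known is true), knowing a conjunction means knowing each part, and \( \Diamond \) stands for “it is possible that.” The symbol \( \Diamond \), the letter \( p \), and the step numbers are introduced here for illustration.

\[
\begin{aligned}
&(1)\quad \phi \supset \Diamond K\phi && \text{knowability, for every } \phi \\
&(2)\quad p \land \neg Kp && \text{assumption: some truth } p \text{ is unknown} \\
&(3)\quad \Diamond K(p \land \neg Kp) && \text{from (1) and (2)} \\
&(4)\quad K(p \land \neg Kp) \supset (Kp \land K\neg Kp) && \text{knowing a conjunction} \\
&(5)\quad K\neg Kp \supset \neg Kp && \text{factivity} \\
&(6)\quad \neg K(p \land \neg Kp) && \text{from (4), (5): otherwise } Kp \land \neg Kp \\
&(7)\quad \neg \Diamond K(p \land \neg Kp) && \text{(6) is provable, so it holds necessarily} \\
&(8)\quad \neg (p \land \neg Kp), \text{ i.e. } p \supset Kp && \text{(3) contradicts (7), so (2) is rejected}
\end{aligned}
\]

Line (8) says that every truth is already known, which is the absurd consequence of assuming (1).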
Limits appear even for self-knowledge. Many logicians once accepted the KK thesis: if you know something, you know that you know it (\( K\phi \supset KK\phi \)). Timothy Williamson (born 1955) challenged this with a simple scenario. Suppose you glance at a giant jar of jellybeans and can see there are at least 100. You know that. But do you know that you know there are at least 100? Knowledge is subject to a safety requirement: it must not be easily mistaken. Because your eyesight is imprecise, you can know there are at least 100 only if there are in fact at least 101. So if you knew that you knew there are at least 100, you could infer that there are at least 101 and, repeating the step, at least 102, and so on, until you absurdly know there are more jellybeans than particles in the universe. So the KK thesis fails. You can know something without knowing that you know it; self-knowledge too has boundaries.
Why this matters for how you think every day

These puzzles are not just academic games. They shape how we should treat evidence, surprises, and our own certainty. The red shirt is technically evidence for “all ravens are black,” but it’s such tiny evidence that ignoring it is completely reasonable. A string of nine tails doesn’t force the next toss to be tails, yet most people would be foolish to bet against it. Expecting another tail is rational because we all bring reasonable background assumptions to the table.
When you hear a surprising news story or spot a pattern in your favorite video game, you are using probability and induction without realizing it. The philosophy of knowledge helps you notice the hidden assumptions. It shows why some disputes can’t be settled just by piling up more facts, because the starting points matter too. And it reminds us that some truths may simply be out of reach, no matter how clever we are.
So next time you see a weird piece of evidence—a red shirt, a coin landing tails nine times, a friend’s unlikely story—remember: confirmation comes in degrees, and what looks like strong proof might be weaker than you think.
Think about it
- If you saw ten white ravens, would you still need to check every single raven in the universe before being sure “all ravens are black” is false? Why or why not?
- Suppose you have a lucky coin that has always come up heads. Is it irrational to bet it will be tails next time? Or can you rationally believe it’s still just chance?
- Can you think of something that is true but that nobody could ever know? If you can, does that mean there are unknowable truths?





