How Sure Can You Be? The Hidden Math Behind 'Evidence'
The Eclipse That Weighed a Theory

On a spring day in 1919, two teams of astronomers pointed their cameras at the sun. They had traveled to remote corners of the world, one group to an island off Africa and the other to Brazil, to wait for a total eclipse. Their goal: record the positions of stars that appeared near the sun’s edge and measure whether their light had been bent by gravity. This single measurement would decide between two competing pictures of the universe. Albert Einstein’s brand-new General Theory of Relativity (GTR) said space-time near the sun is curved, so starlight should be deflected by about 1.75 arcseconds. The older Newtonian theory allowed either no deflection or, at most, 0.875 arcseconds. When the results came in, the numbers leaned toward Einstein: one team measured 1.61 arcseconds, the other 1.98. The evidence seemed to support the wild idea of curved space. But how sure could they really be? Could a difference of about one arcsecond overturn a theory that had worked for centuries? This is the hardest problem in any kind of detective work, scientific or otherwise, and it’s what a branch of logic called inductive inference tries to solve. The most widely used tool for this job is named after an 18th-century mathematician, Thomas Bayes (1701–1761), and it’s the hidden engine behind everything from medical diagnosis to testing the fairness of a coin.
From Hunches to Numbers: The Recipe for a Good Argument

When you wonder whether a friend is at the library or the park, you might check if her bike is gone. You know that she almost always takes her bike to the park. The missing bike is evidence, and you weigh it against your background knowledge. Bayesian inductive logic turns this everyday weighing into a number. The goal is to express how strongly some claim (the hypothesis) is supported by a set of facts (the evidence). The logic uses the same idea as betting odds: if you would risk two dollars to win one on a hunch, you are treating that hunch as having a probability of about 2 out of 3. Bayes’ framework starts with a simple notation. P( C | D ) means “the probability of C given D”; in other words, if D is all the information you have, how likely is C? This number always falls between 0 (impossible) and 1 (certain). For any hypothesis, we ask: what is the probability that it is true, given the evidence we’ve observed? This is called the posterior probability. To get there, we need two ingredients. First, the likelihood: how probable the evidence would be if the hypothesis were true. Second, the prior probability: how plausible the hypothesis was before you saw that evidence, based on general knowledge, simplicity, or past patterns. The heart of the whole system is an equation that weaves them together.
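Written out, the equation looks roughly like this. The symbols are labels chosen here for illustration and are not taken from the text above: H stands for a hypothesis, E for the evidence, and H_1 and H_2 for two rival hypotheses. The first line is the standard form of Bayes’ theorem; the second is the odds form, which is the version the examples in this article rely on.

```latex
% Probability form: posterior = likelihood x prior / overall probability of the evidence
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}

% Odds form: posterior odds = likelihood ratio x prior odds
\frac{P(H_1 \mid E)}{P(H_2 \mid E)}
  = \frac{P(E \mid H_1)}{P(E \mid H_2)} \times \frac{P(H_1)}{P(H_2)}
```

In the odds form the awkward term P(E) cancels, which is why comparing two hypotheses is often easier than scoring one on its own.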
The Coin Toss That Spoke — Loudly

Suppose you have a coin. Someone says it’s fair (50% heads). You toss it 100 times and get 72 heads. Was it just a string of luck, or is the coin biased? Bayesian logic lets us compare two hypotheses. Hypothesis 1: the coin is fair, and the chance of heads on each toss is exactly 1/2. Hypothesis 2: the coin is biased so that heads appear 3/4 of the time. For the outcome we saw (72 heads, 28 tails), we can calculate the likelihood each hypothesis gives to that exact pattern. For the fair coin, every sequence of heads and tails with those totals has a probability of (½)^72 × (½)^28, which is (½)^100: a fantastically small number, about 8 × 10^-31. For the biased coin, it’s (¾)^72 × (¼)^28, about 1.4 × 10^-26, which is still tiny but far larger. The ratio of those two likelihoods is the likelihood ratio: in this case, the evidence is roughly 18,000 times more likely if the coin is biased than if it is fair. Bayes’ rule, in its simplest form (called Rule RB), says: after you see the evidence, the odds in favor of the biased hypothesis equal the odds you started with, multiplied by that likelihood ratio. If you began by thinking the coin was just as likely fair as biased, those odds were 1 to 1. After the 72 heads, the odds become about 18,000 to 1 in favor of bias. The evidence has overwhelmingly re-weighted the scales.
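The coin comparison can be checked in a few lines. This is a minimal sketch, not a general-purpose tool: the function name `sequence_likelihood` and the variable names are made up for illustration, and exact fractions are used so that the very small numbers do not suffer rounding error.

```python
from fractions import Fraction


def sequence_likelihood(p_heads, heads, tails):
    """Probability of one specific sequence with the given head/tail counts."""
    return p_heads ** heads * (1 - p_heads) ** tails


heads, tails = 72, 28

# Likelihood of the observed sequence under each hypothesis.
fair = sequence_likelihood(Fraction(1, 2), heads, tails)
biased = sequence_likelihood(Fraction(3, 4), heads, tails)

# How many times more probable the evidence is under "biased" than "fair".
likelihood_ratio = biased / fair

# Starting from even odds (1 to 1), the posterior odds equal the ratio itself.
prior_odds = Fraction(1, 1)
posterior_odds = prior_odds * likelihood_ratio

print(f"fair coin likelihood:   {float(fair):.2e}")
print(f"biased coin likelihood: {float(biased):.2e}")
print(f"likelihood ratio:       {float(likelihood_ratio):,.0f}")
print(f"posterior odds:         {float(posterior_odds):,.0f} to 1")
```

Changing `prior_odds` shows how a skeptical starting point (say, 1 to 100 against bias) would shrink the final odds by the same factor.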
Why a Single Test Can Trick You

Medical testing shows the other half of the story: prior probabilities matter just as much as likelihoods. Consider a home COVID test, the kind you might use when you have symptoms. Suppose the test has a sensitivity of 94% (it gives a positive result for 94 out of 100 infected people) and a specificity of 98% (it gives a negative result for 98 out of 100 uninfected people). Those numbers sound reassuring. If you test positive, you might think you are almost certainly sick. But the true answer depends on how common the disease is among people like you: the base rate. Imagine only 5% of symptomatic people in your area actually have COVID at the moment. Bayesian inference, using the odds version called Rule OB, tells us that your odds after a positive result are the prior odds times the likelihood ratio of a positive test. A 5% probability means prior odds of 5 to 95, or 1 to 19. The likelihood ratio is the chance of a positive result if you are infected divided by the chance of a positive result if you are not: 0.94 / 0.02 = 47. That gives posterior odds of (1/19) × 47 ≈ 2.5 to 1, which translates to a probability of only about 71%. So even with a positive test, there is about a 29% chance you do not have the disease, because the prior plausibility was low. If the base rate were higher, say 30% among symptomatic people, the same positive test would push your probability to about 95%. The math is the same and the evidence hasn’t changed, but the starting point shifts everything. That’s why a doctor’s sense of what’s going around can change how they interpret the same lab report.
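The test calculation can be sketched as a small function. The name `posterior_probability` and its parameter names are assumptions made for this illustration; the numbers are the ones used in the paragraph above.

```python
def posterior_probability(base_rate, sensitivity, specificity):
    """Probability of infection after a positive test, via the odds form."""
    prior_odds = base_rate / (1 - base_rate)
    # Chance of a positive if infected, divided by chance of a positive if not.
    likelihood_ratio = sensitivity / (1 - specificity)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Same test, two different base rates.
low_base_rate = posterior_probability(0.05, 0.94, 0.98)
high_base_rate = posterior_probability(0.30, 0.94, 0.98)

print(f"5% base rate:  {low_base_rate:.0%}")   # about 71%
print(f"30% base rate: {high_base_rate:.0%}")  # about 95%
```

Only the first argument changes between the two calls, which makes the point of the paragraph visible: the test is identical, and the base rate alone moves the answer.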
A Puzzle in the Rocks: When Evidence Wasn’t Enough

Sometimes the evidence seems plain, but it still fails to convince. Take the case of continental drift. As early as the 19th century, someone looking at a map could see that the east coast of South America fits the west coast of Africa like a jigsaw puzzle. Geologists later found that matching rock layers and fossil species, from a freshwater reptile called Mesosaurus to the fern-like Glossopteris, appeared on both continents and dated to the same ancient periods. None of these organisms could plausibly have crossed an ocean. If you calculated the likelihood of all this evidence under two competing theories, the drift hypothesis (continents once joined, then moved) would make these observations far more probable than the older contractionist theory (continents fixed in place, Earth just wrinkled as it cooled). Yet for much of the early 20th century, most geologists dismissed drift. Why? Because the prior probability of drift seemed extremely low: no one could imagine a force strong enough to shove whole continents through solid ocean floor. The evidence was strong, but the background belief that continents couldn’t move was even stronger. Once a plausible mechanism (plate tectonics driven by mantle convection) was discovered, the prior plausibility jumped, and the evidence finally had its full persuasive power. The Bayesian framework captures this exactly: the prior odds can overpower a likelihood ratio if the claim is sufficiently extraordinary, or they can be swamped by overwhelmingly strong evidence.
The Formula You Already Use

You do not need a spreadsheet to think like a Bayesian. Every time you hear a surprising story from a friend, you are, in your head, multiplying likelihoods by priors. If a friend tells you they saw a famous singer at the mall, you ask yourself: how likely is it that the singer would be at this mall? (That’s a prior.) And how likely is it that my friend would say so if it were true, versus if it were a joke? (That’s the likelihood.) Bayes’ rule doesn’t tell you what to believe; it only tells you how your beliefs should fit together if you want to avoid contradicting yourself. It shows why extraordinary claims require extraordinary evidence: when a hypothesis starts out wildly implausible, even a huge likelihood ratio from a single piece of evidence may leave it still improbable. And it shows why good scientists, like good detectives, try to reduce their background assumptions to a common set, so the evidence can speak as clearly as possible. Next time you’re faced with a claim, be it a weird medical symptom, a political poll, or a questionable statistic, you can ask: What exactly did I observe? How strongly does that connect to the explanation? And how likely did I think it was before? That’s the hidden logic behind every honest attempt to learn about the world.
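The point about extraordinary claims can be made concrete with a short sketch. Every number below is hypothetical and chosen only for illustration, and the function name `update` is made up here. The last step also assumes the two pieces of evidence are independent of each other under each hypothesis, which real evidence often is not.

```python
def update(prior_prob, likelihood_ratio):
    """Return the posterior probability after one piece of evidence."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical numbers, for illustration only.
wild_prior = 1e-6        # a one-in-a-million claim
ordinary_prior = 0.2     # a fairly plausible claim
strong_evidence = 1000   # evidence 1,000 times likelier if the claim is true

wild_after_one = update(wild_prior, strong_evidence)
ordinary_after_one = update(ordinary_prior, strong_evidence)

# A second, independent piece of equally strong evidence (an assumption).
wild_after_two = update(wild_after_one, strong_evidence)

print(f"wild claim, one piece of evidence:  {wild_after_one:.4f}")   # roughly 0.001
print(f"ordinary claim, same evidence:      {ordinary_after_one:.3f}")  # roughly 0.996
print(f"wild claim, two pieces of evidence: {wild_after_two:.2f}")   # roughly 0.50
```

The same evidence makes the ordinary claim nearly certain and leaves the wild claim at about one in a thousand; it takes repeated strong evidence to bring the wild claim up to even odds.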
Think about it
- Think of a time when you changed your mind after seeing new evidence. Was the evidence surprisingly strong, or were you just not very attached to your starting belief?
- If a medical test is 99% accurate (for both sick and healthy people) and you test positive for a very rare disease (1 in 10,000 people have it), many people, doctors included, guess that your chance of actually having the disease is high. In fact it is only about 1%. Why do you think even experts find that so hard?
- Could a detective ever be completely objective, or do their prior biases always affect how they weigh clues? Would knowing the Bayesian formula help them, or just give them a false sense of precision?
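For the rare-disease question above, counting people instead of multiplying odds can make the answer easier to see. This is a sketch under one stated assumption: “99% accurate” is taken to mean that both the sensitivity and the specificity are 99%. The population size and variable names are arbitrary choices for illustration.

```python
population = 1_000_000
prevalence = 1 / 10_000   # 1 in 10,000 people have the disease
sensitivity = 0.99        # assumption: "99% accurate" for sick people
specificity = 0.99        # assumption: "99% accurate" for healthy people

sick = population * prevalence
healthy = population - sick

# People who test positive, split by whether they are actually sick.
true_positives = sick * sensitivity
false_positives = healthy * (1 - specificity)

# Of everyone who tests positive, what share is actually sick?
share_sick = true_positives / (true_positives + false_positives)

print(f"sick and positive:    {true_positives:,.0f}")
print(f"healthy and positive: {false_positives:,.0f}")
print(f"chance a positive result means disease: {share_sick:.1%}")
```

Roughly a hundred times as many healthy people as sick people end up with a positive result, simply because there are so many more healthy people to begin with.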





