Is Measurement Just a Game We Made Up?

A Viennese Physicist and a Troubling Question

Mach noticed that different liquids expand at different rates—so which one shows the ‘real’ temperature?

Picture a laboratory in Vienna, 1896. A white-bearded physicist named Ernst Mach (1838–1916) stares at two thermometers on his workbench. One contains red alcohol, the other silver mercury. Both sit in the same warm water. But when the alcohol climbs to the 30-degree mark, the mercury reaches only 29.5. Which one is right?

You’ve probably measured things dozens of times — your height, the weight of flour, the temperature outside. We usually trust the numbers. Mach’s two thermometers, though, reveal a crack in that trust. To decide which thermometer shows the “real” temperature, you first need to know what it even means for two temperature intervals to be equal. And that, Mach realized, is not a question nature answers on its own. It’s a choice we humans have to make.

That insight opened a debate that has run for over a century: is measurement a way of reading numbers that already exist in the world, or is it more like writing a rulebook that everyone agrees to follow?

Can You Measure Something You Can’t See?

Fechner believed you could measure the strength of a sensation by tracking the smallest difference a person can notice.

Even stranger than temperature was the question Gustav Fechner (1801–1887) faced. He wanted to measure something you can’t see or touch: how loud a sound feels to a person. Fechner played two slightly different sounds and asked listeners to say when they noticed a change. He called the smallest detectable difference a just noticeable difference and treated it as a tiny, equal step in the intensity of sensation. By counting those steps, he proposed a law linking the strength of a stimulus to the strength of the sensation it produces (Fechner’s law, on which sensation grows with the logarithm of the stimulus) and claimed that sensations could be measured just like length.
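
To see why counting equal steps leads to a logarithmic law, here is a minimal Python sketch. It assumes Weber’s law, the premise Fechner started from: each just noticeable difference is a fixed fraction of the current stimulus. The function names, the 10% fraction, and the stimulus values are illustrative assumptions, not figures from Fechner’s experiments.

```python
import math


def count_jnd_steps(stimulus, threshold, weber_fraction):
    """Count just noticeable steps from the threshold up to the stimulus.

    Assumption (Weber's law): each noticeable step raises the stimulus
    by the same fixed fraction of its current level.
    """
    steps = 0
    level = threshold
    while level * (1 + weber_fraction) <= stimulus:
        level *= 1 + weber_fraction
        steps += 1
    return steps


def fechner_sensation(stimulus, threshold, weber_fraction):
    """Closed form of the same count: proportional to log(stimulus / threshold)."""
    return math.log(stimulus / threshold) / math.log(1 + weber_fraction)


# Hypothetical values: threshold of 1 unit, 10% Weber fraction.
steps_at_100 = count_jnd_steps(100.0, 1.0, 0.1)        # 48 steps
steps_at_10000 = count_jnd_steps(10000.0, 1.0, 0.1)    # 96 steps
```

Making the stimulus a hundred times stronger (from 100 to 10,000 units) only doubles the step count, from 48 to 96. That slow growth is the signature of a logarithm, and it holds only as long as the equal-step assumption does.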

Not everyone agreed. The British physicist Norman Campbell (1880–1949) objected that to measure something fundamentally, you need to be able to combine it physically, the way you lay rulers end to end to make a longer line. You can’t stack loudnesses the way you stack apples on a scale. For Campbell, if you can’t physically combine the thing, you’re not measuring it in a rigorous sense; you’re merely ordering it.

The American psychologist S.S. Stevens (1906–1973) pushed back hard. He defined measurement simply as “assigning numbers according to rules.” If the rules are consistent and help you spot regularities in data, he said, it counts. By that standard, you can measure headache intensity, mood, and yes, loudness. The numbers don’t have to mirror physical addition; they just have to work for the job at hand.

The Rule-Makers: Operationalism and Conventionalism

Bridgman said that the meaning of ‘length’ is nothing more than the operations you use to measure it.

Stevens was borrowing a powerful idea from the physicist Percy Bridgman (1882–1961). In 1927, Bridgman declared that operationalism was the proper way to think about scientific concepts. He argued: “we mean by any concept nothing more than a set of operations; the concept is synonymous with the corresponding set of operations.” For length, that means the concept length just is the act of laying rigid rods end to end. If you measure length with a ruler, that’s one concept — call it length-1. If you time how long an electromagnetic pulse takes to bounce back, that’s length-2. There is no guarantee they’re the same thing.

This view solved some problems but created others. As Carl Hempel (1905–1997) pointed out, operationalism left important scientific terms — like “soluble” — undefined, because you can’t specify a single operation that captures every situation. It also threatened to explode the number of concepts scientists must juggle, which runs against the goal of simple, unified theories.

A less extreme alternative took hold: conventionalism. The mathematician Henri Poincaré (1854–1912) and the logical positivist Hans Reichenbach (1891–1953) argued that measurement always rests on coordinative definitions — statements that link a concept to a measurement procedure, but are neither true nor false. For example, “a measuring rod retains its length when transported” isn’t something you can verify. A mysterious universal force could stretch or shrink everything equally as you move. Reichenbach said we adopt this statement as a rule, not a fact, because it makes physics simple and convenient.

Ernst Mach himself had earlier shown that even the idea of equal temperature intervals depends on such a rule. Alcohol expands at a different rate than mercury; no experiment can tell you which one expands uniformly, because the very meaning of “uniformly” hangs on which liquid you pick as the standard. The choice is conventional, yet it shapes the laws of nature you later discover.
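
A small Python sketch can make the circularity concrete. It assumes two hypothetical liquids, A and B, whose thermometers are both marked 0 and 100 at the same two fixed points, and an invented relation between their readings in between. The `bend` value and the function names are assumptions chosen for illustration, not measured properties of alcohol or mercury.

```python
def b_reading(a_reading, bend=0.0004):
    """Reading on thermometer B when thermometer A shows a_reading.

    Hypothetical relation: the two agree at 0 and 100 (the shared
    calibration points) and drift apart in between.
    """
    return a_reading + bend * a_reading * (100.0 - a_reading)


def interval_sizes(readings):
    """Differences between consecutive readings."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Take liquid A as the standard: its marks are equally spaced by definition.
a_marks = [0.0, 25.0, 50.0, 75.0, 100.0]
b_marks = [b_reading(a) for a in a_marks]

a_steps = interval_sizes(a_marks)  # all equal, because A is the standard
b_steps = interval_sizes(b_marks)  # unequal when judged against A
```

With A as the standard, B appears to expand unevenly. Had B been chosen as the standard instead, its steps would be equal by definition and A would be the one that looks uneven. Nothing in the readings themselves settles which description is right.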

The Realists Strike Back

If different instruments keep agreeing on the same ratio, maybe they’re all latching onto a real property in the world.

What if those rules aren’t arbitrary after all? Realism about measurement says that when you measure something, you are estimating an objective property that exists independently of your instruments and your conventions. The length of a table has a true ratio to a standard meter, whether you measure it or not.

Realists like Brent Mundy and Chris Swoyer (both writing in the 1980s) argued that measurement theory works best if you interpret it as describing real, universal magnitudes — like the property of being 5 meters long — rather than just describing concrete sticks and scales. If length weren’t a real property with a stable, extensive structure (one that mirrors addition), it would be a huge coincidence that so many different ways of measuring length keep giving the same results. Moreover, when scientists talk about “measurement error” or “improving accuracy,” they seem to assume there’s a true value to get closer to. If measurement were pure convention, such talk would be nonsense.

Realists point to the very mathematical theories that early measurement theorists like Hermann von Helmholtz (1821–1894) built. Those theorists showed that certain qualitative structures — ordering things from shorter to longer, or concatenating weights — map beautifully onto the arithmetic of numbers. The realists ask: why should the world cooperate like that, unless the properties themselves have a number-like structure?

They don’t deny that conventions play a role in choosing units or standard instruments. But they insist that once those choices are fixed, the values you obtain answer to something outside your choices. The close agreement between a mercury thermometer and an alcohol thermometer is not a miracle; it’s evidence that both track the same underlying quantity, temperature, even if they do it imperfectly.

Measuring the Mind: Psychometrics and Constructs

Psychologists build invisible models to link test answers to hidden abilities—a different kind of measurement.

Today’s psychologists face the same puzzle Fechner did. When a test claims to measure “English comprehension” or “anxiety,” it does not lay anything end to end. Instead, it relies on a model — a simplified, abstract picture of how a hidden construct relates to the concrete answers people give. A classic example is the Rasch model (developed by Georg Rasch in 1960). It uses a mathematical formula to predict the chance a person will answer a question correctly, based on the person’s ability and the item’s difficulty.
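
The formula at the heart of the Rasch model is short enough to sketch in Python. In its standard form, the probability of a correct answer depends only on the gap between the person’s ability and the item’s difficulty, both placed on the same scale. The function and variable names and the sample values below are illustrative assumptions.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under the Rasch model.

    P(correct) = exp(ability - difficulty) / (1 + exp(ability - difficulty))
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_score(ability, item_difficulties):
    """Expected number of correct answers across a set of items."""
    return sum(rasch_probability(ability, d) for d in item_difficulties)


# Hypothetical test with one easy, one medium, and one hard item.
item_difficulties = [-1.0, 0.0, 1.0]
evenly_matched = rasch_probability(0.0, 0.0)  # 0.5: ability equals difficulty
```

When ability equals difficulty, the chance of success is exactly one half. Raising ability or lowering difficulty by the same amount shifts the probability in the same way, which is what lets the model place people and questions on a single scale.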

This is model-based measurement. You don’t observe ability directly. You observe patterns in questionnaire responses, fit a model to those patterns, and then infer a score. The process of checking that the test actually measures the intended construct is called construct validation. It involves seeing whether different tests that are supposed to measure the same ability behave in similar ways.

Can we be realists about psychological attributes? Some philosophers, like Denny Borsboom, argue yes — an attribute exists and the test is valid if variations in the attribute cause variations in the test results. Others, like Anna Alexandrova, worry that when we measure things like “happiness” or “well-being,” we often avoid thinking hard about what those words mean, and instead fall back on people’s fuzzy everyday beliefs. That can turn validation into an empty ritual.

The debate shows that measurement isn’t just about physics. Every time you receive a grade, a score, or a percentile rank, someone has made modeling choices — and those choices are shaped by values, not just by facts.

Why It Matters Every Time You Check the Weather

The thermometer you trust outside your window is the result of more than a century of choices about what temperature means.

Back to Mach’s two thermometers. By now you can see there’s no magic answer to which liquid is “correct.” Instead, scientists solved the problem through a spiral Hasok Chang calls epistemic iteration. You start with a rough, everyday idea of temperature and a crude instrument. That instrument lets you test some simple theories. Those theories, in turn, help you design a better instrument. Over time, the concept and the measurement procedure refine each other, each loop making the whole system more coherent. The circle isn’t vicious — it’s progressive.

Something similar happened when the kilogram was redefined in 2019. No longer tied to a metal cylinder kept in a vault near Paris, it is now defined by fixing the numerical value of the Planck constant. That definition is woven into a web of theory, but it is realized by extremely precise instruments that can be replicated anywhere. Measurement, in this view, is neither pure reading nor pure invention; it’s a process of tightening the fit between our ideas and the world.

The next time you glance at a thermometer or step on a scale, you’re not just getting a number. You’re tapping into centuries of negotiation between human choices and stubborn reality. That number is both discovered and made.

Think about it

If you were in charge of defining the second, would you base it on the swing of a pendulum or the vibration of an atom? How would your choice affect the experiments that come later?
A teacher grades your essay and gives you an 85. Could a different teacher give you a 90? What does that tell you about whether the grade is a real property of the essay?
Your friend says, “This music is twice as loud as before.” Can you prove them wrong? Why might you still want to try to measure loudness across many listeners?
