Who Belongs Where on the Family Tree of Life?

The Fish That Confused Everyone

For decades, scientists fought over whether the coelacanth or the lungfish was our closest fish relative.

In 2013, a team of biologists announced they had finally solved a fight that had been simmering for nearly a century. The question: which odd-looking fish is the closest relative of the very first animals to crawl onto land? Was it the coelacanth, a deep-sea creature once thought to have gone extinct with the dinosaurs? Or was it the lungfish, which has fleshy fins and actual lungs? To lay the squabble to rest, the scientists built a huge family tree — and the answer they got reshuffled the branches that lead straight to us.

That kind of family tree is called a phylogeny. A phylogeny is a diagram of evolutionary history: it shows how different groups of living things are related through time. The whole project of making such trees is phylogenetic inference — figuring out what the tree should look like from the clues that nature left behind. In the coelacanth-versus-lungfish case, the tree told a clear story: the lungfish is the closest fishy cousin to frogs, lizards, and mammals, including you.

Reading a phylogeny is like tracing a winding road backward. Each line, or branch, stands for a lineage of ancestors. Where branches split, a node marks the common ancestor of all the groups that sprout from it. On a tree drawn with today's species at the right-hand tips, moving from right to left means traveling back in time. The groups that share a more recent node are more closely related. Lungfish and frogs meet at a node that is closer to today than the node where either meets a pufferfish, so lungfish and frogs are nearer kin.
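To make "sharing a more recent node" concrete, here is a minimal sketch in Python. The tree is a hand-built toy, the node names are made-up labels rather than formal taxon names, and the frog stands in for all land vertebrates.

```python
# Hypothetical toy tree: each entry maps a node to its parent.
# Node names are illustrative labels, not formal taxon names.
PARENT = {
    "lungfish": "lungfish+tetrapods",
    "frog": "lungfish+tetrapods",
    "lungfish+tetrapods": "lobe-finned fishes",
    "coelacanth": "lobe-finned fishes",
    "lobe-finned fishes": "bony fishes",
    "pufferfish": "bony fishes",
}


def ancestors(node):
    """Return the path from node back to the root, nearest ancestor first."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(a, b):
    """Most recent common ancestor: the first ancestor of a that is also above b."""
    above_b = set(ancestors(b))
    for node in ancestors(a):
        if node in above_b:
            return node
    raise ValueError("no common ancestor in this tree")


def closer_to(x, a, b):
    """Return whichever of a or b shares the more recent node with x (b on a tie)."""
    steps_a = ancestors(x).index(mrca(x, a))
    steps_b = ancestors(x).index(mrca(x, b))
    return a if steps_a < steps_b else b
```

Calling `closer_to("frog", "lungfish", "pufferfish")` on this toy tree returns `"lungfish"`, because the frog reaches its shared node with the lungfish in fewer steps back than its shared node with the pufferfish.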

What Makes a Real Family Group?

Phylogenetic trees are not just pretty pictures; they carry hard rules about what counts as a real natural group. Biologists talk about monophyletic groups, also called clades. A clade is an ancestor plus every single one of its descendants. The mammals form a clade because all mammals, from platypuses to whales, share a unique common ancestor that no bird or lizard can claim. If you tried to group sharks together with pufferfish and leave out frogs and mammals, you would get something called a paraphyletic group: a clan with a common ancestor but with some descendants unfairly kicked out. That kind of group is a branch with a piece snipped off rather than a whole branch of the tree, so it cannot do the explanatory work that clades can.
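The clade test can be written down in a few lines. This is only a sketch: the tree below is a simplified, hand-written stand-in with six animals, stored as nested pairs, and the function names are invented for this example.

```python
# Hypothetical, heavily simplified vertebrate tree written as nested pairs.
TREE = ("shark", ("pufferfish", ("frog", ("lizard", ("platypus", "whale")))))


def leaves(node):
    """Return the set of tip names found under a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the tip set of every node in the tree, one set per clade."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(group, tree=TREE):
    """A group is a clade only if it is exactly the tip set of some node."""
    return frozenset(group) in set(clades(tree))
```

On this toy tree, `is_monophyletic({"platypus", "whale"})` is true, while `is_monophyletic({"shark", "pufferfish"})` is false: the smallest branch holding both fish also holds the frog, the lizard, and the mammals.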

How do we figure out which groups are clades? By looking at homologies. A homology is a trait that two species share because they inherited it from the same ancestor. The mammary glands of a human and an elephant are homologous; both trace back to the earliest mammals. Homologies form a nested pattern: all mammals have hair and produce milk, but a smaller group inside the mammals — the placental mammals — also share a special womb structure. Frogs lack an amniotic sac, but lizards, birds, and mammals all have one, revealing that the amniotic sac evolved after the frog branch parted ways. This nesting of traits allowed scientists long before Darwin to group organisms into a natural hierarchy. Darwin himself argued that the nested hierarchy of life was the single best piece of evidence for common ancestry.

But there is a twist. You need a phylogeny to test whether a trait is truly a homology, yet you need homologies to build the phylogeny in the first place. It sounds like a “chicken-and-egg” puzzle. Biologists work through it by a process called reciprocal illumination: your preliminary ideas about the tree get tested and corrected as new traits come to light, and those corrected trees in turn force you to recheck which traits really are the same by descent. Some critics have called this circular, but defenders reply that it is exactly how scientific tests work: every observation refines the next round of questions.

The Clash of the Tree-Builders

Two ways to find the best tree: cut out extra steps, or calculate which tree makes the data most probable.

If you give the same set of genetic data to two different tree-building programs, they may hand you back two different trees. Why? Because phylogeneticists disagree deeply about which method of inference is best. The loudest fight pits parsimony against model-based statistical methods.

Parsimony is a principle of simplicity: the tree that requires the fewest evolutionary changes to explain the data is the one you should choose. Early defenders of parsimony, such as the biologist Willi Hennig (1913–1976), argued that the method is scientific because it can be severely tested. Each new character you examine is a potential shot at falsifying the current tree; the tree that survives the most shots is the most trustworthy. Some parsimony enthusiasts wrapped themselves in the philosophy of Karl Popper, claiming that only trees built with the fewest assumptions about how evolution works are genuinely testable. They called themselves “cladists” and insisted that you must never smuggle in a model of how fast or how randomly DNA changes.
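Counting "the fewest evolutionary changes" is something a short program can do. The sketch below uses Fitch's counting method, a standard way to score a tree under parsimony, which is a choice made for this example and not something the cladists above are quoted as using. The three yes/no characters are invented purely for illustration and are not real anatomical data.

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The two sides can agree on a state, so no extra change is needed here.
        return shared, left_cost + right_cost
    # The two sides disagree, so at least one change happened below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, characters):
    """Total number of changes a tree needs to explain every character."""
    return sum(fitch(tree, states)[1] for states in characters)


# Two rival trees, each written as nested pairs with the pufferfish as outgroup.
LUNGFISH_SISTER = ("pufferfish", ("coelacanth", ("lungfish", "frog")))
COELACANTH_SISTER = ("pufferfish", ("lungfish", ("coelacanth", "frog")))

# Invented 0/1 characters (NOT real data): two favor lungfish, one coelacanth.
CHARACTERS = [
    {"pufferfish": 0, "coelacanth": 0, "lungfish": 1, "frog": 1},
    {"pufferfish": 0, "coelacanth": 0, "lungfish": 1, "frog": 1},
    {"pufferfish": 0, "coelacanth": 1, "lungfish": 0, "frog": 1},
]
```

With these made-up characters, `parsimony_score(LUNGFISH_SISTER, CHARACTERS)` is 4 and `parsimony_score(COELACANTH_SISTER, CHARACTERS)` is 5, so parsimony would pick the lungfish tree here. A different set of characters could just as easily flip the result.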

On the other side sit the statistical phylogeneticists, who treat phylogeny as a problem for probability. They point out that DNA sequences are data that can be modeled mathematically. In a typical likelihood method, you pick a model of how nucleotides mutate. The simple Jukes-Cantor model, for example, assumes that any of the four DNA letters can flip to any other with equal odds. Then you compute the probability of seeing the actual sequences you collected, given a particular tree. The tree that makes your data most likely is the winner. A related Bayesian approach adds prior beliefs about how plausible different trees are before you see the new evidence.
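Here is a minimal sketch of the likelihood idea in its simplest possible setting: two sequences joined by a single branch, scored under the Jukes-Cantor model. Real programs score whole trees with many branches. The two short sequences are invented for the example, and `d` stands for the branch length measured in expected changes per site.

```python
import math

# Invented example sequences (NOT real data); they differ at 2 of 10 sites.
SEQ_A = "ACGTACGTAC"
SEQ_B = "ACGTTCGTAA"


def jc69_prob(same, d):
    """Jukes-Cantor chance of ending on a given letter after branch length d."""
    decay = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def log_likelihood(seq1, seq2, d):
    """Log-probability of two aligned sequences separated by branch length d."""
    total = 0.0
    for x, y in zip(seq1, seq2):
        # 0.25 is the chance of the starting letter; the rest is the change.
        total += math.log(0.25 * jc69_prob(x == y, d))
    return total


def ml_distance(seq1, seq2):
    """Branch length that makes the data most probable under Jukes-Cantor."""
    p = sum(x != y for x, y in zip(seq1, seq2)) / len(seq1)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For these two sequences `ml_distance(SEQ_A, SEQ_B)` comes out at roughly 0.23, and `log_likelihood` is higher at that value than at, say, 0.1 or 0.5. Choosing among whole trees works the same way in spirit: try each candidate, and keep the one that gives the data the highest probability.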

The statistical camp has a sharp weapon: long-branch attraction. Joseph Felsenstein (1942–) showed in 1978 that parsimony can be systematically fooled. Imagine two branches on the tree that both evolved unusually fast. By chance alone, they will rack up matching mutations that are not due to shared ancestry. Parsimony, which loves a short explanation, will often group those two fast-evolving branches together as close sisters — even when the true history says they are only distant cousins. That means parsimony is statistically inconsistent: adding more and more data does not guarantee getting closer to the true tree; in some cases, it pushes you further away. Statisticians often treat statistical consistency as a make-or-break property for any method. Likelihood methods, when the model is correctly chosen, are provably consistent.
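The trap can be checked with exact arithmetic instead of simulation. The sketch below is a toy in the spirit of Felsenstein's argument, not a reproduction of his paper: four species, traits with only two states, and a true tree ((A,B),(C,D)) in which A and C sit on long branches. The numbers `p_long` and `p_short` are assumed chances that a trait flips along a long or a short branch.

```python
from itertools import product


def edge(child, parent, change):
    """Probability of the child's state given the parent's on one branch."""
    return change if child != parent else 1.0 - change


def pattern_prob(a, b, c, d, p_long, p_short):
    """Probability of the tip states on the true tree ((A,B),(C,D)).

    A and C are on long branches; B, D, and the middle branch are short.
    """
    total = 0.0
    for u, v in product((0, 1), repeat=2):  # hidden ancestors of (A,B) and (C,D)
        total += (0.5 * edge(a, u, p_long) * edge(b, u, p_short)
                  * edge(v, u, p_short)
                  * edge(c, v, p_long) * edge(d, v, p_short))
    return total


def parsimony_support(p_long, p_short):
    """Expected share of sites whose pattern favors each four-species grouping."""
    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d in product((0, 1), repeat=4):
        p = pattern_prob(a, b, c, d, p_long, p_short)
        if a == b and c == d and a != c:
            support["AB|CD"] += p
        elif a == c and b == d and a != b:
            support["AC|BD"] += p
        elif a == d and b == c and a != b:
            support["AD|BC"] += p
    return support


def parsimony_choice(p_long, p_short):
    """Grouping that parsimony favors once there is plenty of data."""
    support = parsimony_support(p_long, p_short)
    return max(support, key=support.get)
```

With `p_long = 0.4` and `p_short = 0.05`, `parsimony_choice` returns `"AC|BD"`, pairing the two long branches even though the true tree is AB|CD. With gentler long branches, such as `p_long = 0.1`, it returns the true grouping. Because these are expected frequencies, collecting more sites only makes the wrong answer look more certain in the first case.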

Not everyone accepts that consistency is the ultimate judge, though. The philosopher Elliott Sober has argued that likelihood itself is a basic standard of evidence — if a likelihood inference ever happened to be inconsistent, that would not automatically sink it. And some cladists reject the whole idea that evolution can be squeezed into a probability model at all. They insist that tree-building should be logic-based, not model-based. The debate rumbles on, and different researchers vote with their keyboards daily.

When Genes Tell Different Stories

In the microbial world, genetic material can jump sideways, turning the tree of life into something more like a web.

So far, we have treated a phylogeny as a clean, branching diagram where one lineage splits into two, never to rejoin. That picture is an idealization. Real life is messier. Genes do not always travel obediently from parent to offspring; sometimes they skip sideways. Horizontal gene transfer, a trick especially common among bacteria, lets one microbe pass a chunk of DNA to an unrelated neighbor. In plants, hybridization can cause whole genomes to fuse. Even within the same species, different genes can have slightly different histories — a phenomenon called genealogical discordance.

A major cause of discordance is incomplete lineage sorting. Imagine following two copies of a gene backward in time through a population. If those copies have still not traced back to a single ancestral copy by the time you pass an older species split, the gene tree may end up telling a different story than the species tree. This is like two cousins in a big family who happen to share a rare eye color that no one else in their generation has. They look like close siblings when you focus only on that trait, but the overall family history says otherwise.
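For three species there is a standard textbook formula for how often this happens, and it fits in one short function. This is a sketch under simplifying assumptions that are not part of the story above: a neutral gene, a constant population size, no interbreeding after the splits, and diploid organisms. The parameter names are invented for the example.

```python
import math


def discordance_probability(generations, effective_size):
    """Chance that a gene tree disagrees with a three-species tree.

    generations: time between the older and the younger species split
    effective_size: diploid effective population size during that time
    """
    t = generations / (2.0 * effective_size)  # time in coalescent units
    # With probability exp(-t) the gene copies fail to merge in between the
    # two splits; then 2 of the 3 equally likely pairings give the wrong tree.
    return (2.0 / 3.0) * math.exp(-t)
```

When the two splits happen back to back, the function returns 2/3, meaning most genes disagree with the species tree. When the splits are far apart relative to the population size, the value shrinks toward zero. That is why rapid bursts of new species in large populations leave so many conflicting gene trees behind.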

For a long time, biologists treated such conflicting gene signals as noise that needed to be filtered out. Now, many see them as valuable evidence. New methods like the multispecies coalescent explicitly model the fact that lineages inside a species can “come apart” in time. That lets researchers extract information from discordance rather than sweeping it under the rug. It has also forced some hard philosophical questions: if genes can have separate histories, what exactly is a species tree? Is it the story of how most genes agree, or the story of how reproductive barriers arose? The answers shape what we even mean by a phylogeny.

Why the Tree of Life Still Matters

Tracing the tree of life is not just about the distant past — it helps us understand our own place among living things.

The squabble over coelacanths and lungfish might sound like a narrow museum debate, but it ripples outward. Phylogenies are the backbone of modern biology. When researchers tracked the COVID-19 pandemic, they used phylogenetic methods to map how the virus jumped between people and evolved across borders. Conservation biologists rely on species trees to decide which habitats to protect. Even understanding why some flowers attract bees and others moths depends on getting the evolutionary relationships right.

At bottom, phylogenetic inference is about how we come to know anything about the deep past. It is a live laboratory for philosophy of science: what counts as a good test? When is a simpler explanation better than a more complex one? Should we favor methods that are guaranteed to work in the long run, or those that best respect the evidence we have right now? These are not dusty museum questions; they are the same questions you face when you try to reconstruct any story from scattered clues — whether you are solving a mystery in a video game, piecing together your own family history, or debating which dinosaur was the true ancestor of birds.

Think about it

If two different methods give you two different family trees, and you have no way to go back in time to check, how would you decide which tree to trust?
Imagine you discover that a gene for a particular trait appears in two very distantly related species. Could that discovery count as evidence against a simple branching tree, or is there always a way to explain it without changing the story?
Scientists sometimes treat conflicting gene histories as “noise” to ignore. Can you think of a situation where paying attention to the messy conflicts might actually be more honest — even if it makes the picture blurrier?
