Philosophy for Kids

What Do We See When We Act?

Here’s a simple experiment you can try right now, without any special equipment. Hold your head still. Pick a spot on the wall across the room and stare at it. Now, without moving your eyes, try to notice everything else in your field of vision—the stuff off to the sides, up and down, in the corners. You can see that there’s stuff there, right? But can you see it clearly? Probably not. The only part of the room you see in sharp detail is that tiny spot you’re staring at.

Here’s the strange part: you almost certainly didn’t notice that limitation until I pointed it out. You walked around this morning feeling like you could see the whole world in crisp detail, when in fact your eyes are only capturing a tiny patch of clarity at any moment. Your brain builds the feeling of a full, stable, three-dimensional world out of a series of quick snapshots and movements. How?

This question—how does your body’s own movement help you perceive the world?—has bothered philosophers and scientists for centuries. Some of them have argued that movement isn’t just helpful for perception; it’s essential. You might not be able to see at all unless your body is actively involved.

The Berkeley Problem

In 1709, a young Irish philosopher named George Berkeley published a short book with a weird claim: you can’t actually see distance. Not really.

Berkeley’s argument went like this. When light enters your eye, it hits the back of your eyeball (the retina) and forms an image. That image is flat—two-dimensional, like a photograph. It has up and down, left and right, but no depth. And since what you see is determined by that image, Berkeley said, what you directly see can only be flat. The three-dimensional world you experience—with objects near and far, in front and behind—must be something your mind adds.

How does your mind add it? Through movement. When you see something in the distance, Berkeley argued, what really happens is this: the flat patch of color in your vision triggers memories of what it felt like to walk toward something that looked that way in the past. “Seeing distance” is really remembering what it felt like to move your body toward something until you could touch it. Vision, Berkeley thought, is basically a language: the things you see are signs telling you which touch sensations are coming next, depending on how you move.

This is a radical idea. It means that when you look at a tree across a field, you’re not really seeing how far away it is. You’re seeing a flat patch of green and brown, and your brain is interpreting that patch as a promise: “If you walk this way, you’ll feel bark under your fingers.”
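
If you enjoy tinkering with code, here is one toy way to picture Berkeley’s claim. It is not his argument, just a sketch built on a loose assumption: that “seeing distance” works like looking up, from a flat visual cue, a memory of how much walking it took before you could touch the thing.

```python
# A toy model of Berkeley's idea (an illustration, not his actual theory):
# the flat visual cue carries no distance of its own; past experience has
# linked each kind of cue to how much walking happened before touch did.

remembered_walks = {
    "small blurry green patch": 40,   # roughly 40 steps before you felt bark
    "large sharp green patch": 3,     # only a few steps before you felt bark
}

def what_seeing_means(flat_visual_cue):
    steps = remembered_walks.get(flat_visual_cue)
    if steps is None:
        return "no memory linked to this cue yet"
    return f"a promise: walk about {steps} steps this way and you'll feel something"

print(what_seeing_means("small blurry green patch"))
```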

Most philosophers today think Berkeley went too far. For one thing, when you look at a 3D movie, you know the screen is flat—but you still see depth. Vision seems to have its own built-in ability to perceive three dimensions, not one it borrows from touch. For another, newborn animals can use vision to navigate almost immediately, without having spent years learning how touch and sight relate.

But Berkeley’s core insight stuck: movement matters for perception.

The Problem of a Stable World

Here’s another puzzle your body solves without you noticing. Close your left eye. Hold your right index finger up in front of your right eye, about six inches away. Now, keeping your finger still, look at something behind it on the far wall. Now look back at your finger. Now at the wall.

Your eyes just jumped from one spot to another—twice. Those jumps are called saccades (sa-KAHDS), and they happen about three or four times every second. Each time your eyes move, the image on your retina shifts completely. And yet the world doesn’t seem to jump. Everything stays stable. How?

The dominant answer for much of the 20th century was something called the efference copy theory. Here’s the basic idea: every time your brain sends a command to your eye muscles to move, it also sends a copy of that command to the visual processing areas. That copy tells the visual system: “We’re about to move the eyes 10 degrees to the right, so ignore any image shift that matches that movement.”

Think of it like a camera with image stabilization. The camera senses its own motion and shifts its sensor to compensate, but it can only react after the motion has started. Your brain, according to this theory, does something cleverer: it uses the copy of the command to predict the movement before it even begins.
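
If it helps to see that logic spelled out, here is a toy sketch in code. It is nothing like how the brain actually computes, and the names, numbers, and tolerance are all made up; it only shows the “predict the shift from the command, then subtract it” step.

```python
# A toy sketch of the efference-copy idea (not a brain model): the "visual
# system" compares the image shift it observes against the shift predicted
# from a copy of the eye-movement command.

def world_seems_stable(commanded_eye_shift_deg, observed_image_shift_deg,
                       tolerance_deg=1.0):
    """Return True if the observed shift is explained by our own eye movement."""
    # Moving the eyes right by X degrees shifts the retinal image left by about
    # X degrees, so the predicted image shift is the negative of the command.
    predicted_image_shift_deg = -commanded_eye_shift_deg
    residual = observed_image_shift_deg - predicted_image_shift_deg
    return abs(residual) <= tolerance_deg

# The eyes jump 10 degrees right; the image shifts 10 degrees left: stable.
print(world_seems_stable(10.0, -10.0))   # True
# Same eye movement, but something in the world also moved: not stable.
print(world_seems_stable(10.0, -6.0))    # False
```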

This theory is elegant, but it has problems. For example, if you’re in a completely dark room staring at a single tiny point of light, that point will appear to wander around on its own after a while. This is called the autokinetic effect. If your brain were perfectly canceling out eye movement signals, this shouldn’t happen. There’s also evidence that the compensation isn’t as precise as the theory predicts—people fail to notice fairly large displacements of the visual world right around the time of a saccade.

Current alternatives suggest that your brain isn’t doing precise mathematical cancellation at all. Instead, it might be keeping track of a few key objects in the scene—landmarks—and checking whether they appear in roughly the right place after each eye movement. If they do, the world feels stable. If they don’t, you notice something shifted.
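
Here is the same kind of toy sketch for the landmark idea, again with invented names and numbers: instead of cancelling the whole image shift, the brain might just check whether a few remembered objects still line up.

```python
# A toy sketch of the landmark-checking idea: remember a few key objects and
# ask, after each saccade, whether they are roughly where they should be.
# (Landmark names, coordinates, and the tolerance are all made up.)

def scene_feels_stable(remembered, found_after_saccade, tolerance=0.5):
    """Both arguments map landmark names to (x, y) positions in scene coordinates."""
    for name, (x_old, y_old) in remembered.items():
        if name not in found_after_saccade:
            return False                      # a landmark vanished: not stable
        x_new, y_new = found_after_saccade[name]
        if abs(x_new - x_old) > tolerance or abs(y_new - y_old) > tolerance:
            return False                      # a landmark is out of place
    return True                               # everything roughly matches

before = {"lamp": (2.0, 1.5), "doorway": (-3.0, 0.0)}
after = {"lamp": (2.1, 1.4), "doorway": (-3.0, 0.1)}
print(scene_feels_stable(before, after))      # True: close enough, feels stable
```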

What Happens When the World Fights Back

George Stratton, an American psychologist in the 1890s, performed one of the strangest experiments in the history of perception. He built a lens system that flipped his vision upside down and wore it for over a week.

The first few days were chaos. When Stratton reached for something, his hand went the wrong way. When he moved his head, the world seemed to swing wildly. He felt disoriented and nauseous.

But something remarkable happened after a few days: things started to feel normal again. He could pour water, find objects, navigate rooms. And by the end of the week, the world didn’t look upside down anymore. It just looked… right. When he finally took the lenses off, everything looked bizarre and wrong for a while.

What happened? One possibility, supported by later researchers, is that Stratton didn’t actually change how he saw the world. What changed was how he felt his own body. When you wear inverting lenses and reach for something, you keep missing because your proprioception—your sense of where your limbs are in space—is calibrated for normal vision. Over time, your brain recalibrates: it learns to treat your arm as being where the inverted vision says it is. Eventually, you feel your arm where you see it, even though your arm hasn’t actually moved.

This matters for the big question because it suggests that much of what we call “vision” might actually be a coordination between vision and the sense of our own bodies in motion. Take away the coordination, and your world falls apart—until your brain rebuilds it.

Seeing by Doing

A more recent version of the action-based approach to perception is called the enactive approach, defended by the philosopher Alva Noë and the psychologist J. Kevin O’Regan. They argue that perception isn’t something that happens to you—it’s something you do.

Specifically, they claim that seeing depends on implicit knowledge of what they call sensorimotor contingencies: the rules that connect your movements to changes in your sensory experience. When you look at a coffee mug, you don’t just receive a static image. You implicitly know that if you tilt your head to the right, the mug’s shape will change in a specific way. If you walk toward it, it will loom larger. If you reach for it, your hand will encounter resistance at a specific distance.

According to Noë and O’Regan, this mastery of sensorimotor contingencies is what it means to see the mug as three-dimensional. Without it, you’d just be experiencing flat patches of color.
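
A rough way to picture a sensorimotor contingency is as a rule you could write down, one rule per movement. The rules below are invented examples for a coffee mug; the enactivists’ point is that your mastery of such rules is implicit know-how, not a list you consult.

```python
# Invented example "rules" for a coffee mug. On the enactive view, implicitly
# knowing which rule goes with which movement is part of what seeing the mug is.

mug_contingencies = {
    "tilt_head_right": "the mug's outline shifts and its rim changes shape",
    "step_forward":    "the mug looms larger in the visual field",
    "reach_out":       "fingers meet a hard, curved surface at arm's length",
    "close_your_eyes": "the visual experience of the mug disappears",
}

def expected_change(movement):
    # The dictionary lookup stands in for practical know-how you never state aloud.
    return mug_contingencies.get(movement, "no learned expectation for this movement")

print(expected_change("step_forward"))
print(expected_change("wiggle_your_toes"))
```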

They support their view with evidence from sensory substitution devices. In one famous experiment, blind subjects were fitted with a vibrating pad on their back connected to a video camera. When they could control the camera’s movement themselves—panning left and right, zooming in and out—they eventually started to experience the vibrations as spatial perception. They could catch balls, recognize shapes, and locate objects. But when someone else controlled the camera, they only felt meaningless buzzing on their back.

The difference, say the enactivists, was knowledge of sensorimotor contingencies. Subjects who moved the camera themselves learned the rules: “When I tilt the camera up, the buzzing moves upward on my back. When I zoom in, the buzzing expands.” This practical knowledge turned meaningless vibration into spatial perception.

Critics point out that passive subjects might simply have lacked information about what the camera was doing—that the real issue wasn’t movement itself, but knowing where the camera pointed. Others argue that the enactive approach overstates the role of movement: paralyzed people with no ability to move still seem to perceive space normally.

The Disposition Theory

A different approach, developed by philosopher Gareth Evans and later extended by Rick Grush, focuses on what you’re ready to do when you perceive something.

Evans argued that perceiving an object’s location is inseparable from being disposed to act toward that location. When you hear a sound to your left, you don’t first figure out its coordinates and then decide how to turn your head. The hearing itself includes a kind of readiness to turn left. The spatial content of your perception just is this readiness to act.

Importantly, this isn’t the same as saying you have to actually move. You can hear a sound to your left and stay perfectly still. But your perceptual system is organized for action: it represents space in terms that are directly usable for guiding movement.

Grush proposed a specific neural implementation of this idea. He suggested that certain areas of the brain learn to create representations that encode both sensory information and body posture together. These representations put you in a state of readiness to act: they specify, for any given type of action—reaching, grasping, looking—the precise movement details needed to perform that action toward the perceived object.
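
Here is a toy sketch of what a “readiness” representation might look like if you wrote it down as a data structure. Every field and formula is invented for illustration; the only point is that one stored state can spell out the movement details for several different actions without triggering any of them.

```python
# A toy "readiness" representation (invented for illustration, not Grush's model):
# it combines where the object appears with the body's current posture, and can
# answer "how would I reach?" or "how would I look?" while nothing actually moves.

from dataclasses import dataclass

@dataclass
class ReadinessRepresentation:
    target_direction_deg: float   # where the object appears, relative to the head
    target_distance_m: float      # how far away it appears
    arm_angle_deg: float          # current body posture, simplified to one joint

    def parameters_for(self, action):
        # The same stored state yields movement details for different action types.
        if action == "look":
            return {"eye_rotation_deg": self.target_direction_deg}
        if action == "reach":
            return {"shoulder_rotation_deg": self.target_direction_deg - self.arm_angle_deg,
                    "arm_extension_m": self.target_distance_m}
        raise ValueError(f"no readiness encoded for action: {action}")

state = ReadinessRepresentation(target_direction_deg=30.0,
                                target_distance_m=0.6,
                                arm_angle_deg=10.0)
print(state.parameters_for("reach"))   # readiness exists even if no movement follows
```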

This theory has a nice answer to the puzzle of paralyzed people. If spatial perception depends on readiness to act, but that readiness is implemented in the brain (not in the muscles), then a person whose spinal cord is damaged could still have perfectly normal spatial perception. The readiness is there; what’s missing is the connection to the body.

Questions That Remain

Nobody knows which of these theories is correct, or whether the right answer combines elements from several of them. Here are some of the main unsettled questions:

First, is movement really necessary for perception, or just useful? The enactive approach says movement is essential: you can’t perceive without knowing how movement changes sensation. Other theories treat movement-related information as just one input among many: helpful, but not required. Experiments with paralyzed people and with passive sensory substitution suggest the debate is far from settled.

Second, what counts as “action”? Is it enough for your brain to prepare to move, even if you don’t actually move? Or do you need actual physical movement and sensory feedback? Different theories give different answers, and they lead to very different predictions about what happens in cases like paralysis or virtual reality.

Third, how much of perception is learned? Berkeley thought almost all spatial perception was learned through association with movement. The enactive approach also emphasizes learning. But some evidence suggests that basic spatial abilities are built-in—babies and newborn animals seem to perceive space without much experience.

Where This Shows Up

These ideas aren’t just philosophical puzzles. They show up in several places you might encounter:

  • Virtual reality design: VR systems that cause motion sickness often fail because the visual motion doesn’t match the body’s expected movement signals. Understanding how perception and action are linked helps designers build better experiences.
  • Robotics and AI: Robots that need to navigate the world face the same problems humans do. Some robot designers use principles from the action-based theories to help robots learn to perceive space through movement.
  • Rehabilitation after stroke or injury: People who lose sensation in a limb sometimes lose the ability to perceive space around that limb. Therapies that involve active movement can help rebuild spatial awareness.
  • Sensory substitution for blind individuals: Devices that convert visual information into tactile or auditory signals follow the logic of the enactive approach: users need active control to learn the new “sensorimotor contingencies” of the device.

Appendix

Key Terms

  • Efference copy: A copy of the brain’s command to move a body part, used to predict the sensory consequences of movement
  • Sensorimotor contingency: The rule or pattern connecting a specific movement to a specific change in sensory experience
  • Reafference: Sensory feedback that results from your own movement (as opposed to changes in the world)
  • Proprioception: Your sense of where your body parts are in space, without looking
  • Saccade: A quick, jerky movement of the eyes from one fixation point to another
  • Local sign: In older theories, the distinctive sensation or motor feeling associated with each location on the retina

Key People

  • George Berkeley (1685–1753): An Irish philosopher and bishop who argued that we don’t really see distance—we learn to associate flat visual images with the movement sensations of reaching or walking.
  • Hermann von Helmholtz (1821–1894): A German physicist and physiologist who argued that eye movement commands, not eye position sensations, are crucial for perceiving where things are.
  • Richard Held (1922–2016): An American psychologist who showed that active movement (not passive movement) is necessary for adapting to visual distortions like prism goggles.
  • Alva Noë (born 1964): A contemporary philosopher who argues that perception is a kind of skillful activity—you “see” by knowing how movement changes sensation.
  • Gareth Evans (1946–1980): A British philosopher who argued that perceiving space is inseparable from being ready to act in space.

Things to Think About

  1. If Berkeley were right that we can’t directly see distance, what would that mean for animals that catch prey mid-air? Are they just very good at predicting touch sensations from flat images?

  2. Suppose scientists created a perfect prosthetic eye that restored sight to a blind person. Would that person immediately see the world in 3D, or would they need to learn how movement changes their visual experience? What would count as evidence either way?

  3. In virtual reality, you can feel “present” in a simulated world even though you know you’re standing still in a room. How would the different theories explain this? Which theory has the easiest time accounting for it, and which has the hardest?

  4. Consider your experience of reading this article. Your eyes are making saccades right now, jumping from word to word. Why don’t the words appear to jump around? Can you feel yourself moving your eyes, or does it happen automatically?

More Places These Ideas Show Up

  • Video game designers study saccade suppression to create more immersive experiences—they know your brain is already compensating for movement, so they can design visuals that feel stable.
  • Physical therapists use mirror therapy for phantom limb pain, exploiting the brain’s ability to recalibrate body position based on visual feedback (like Stratton’s experiment).
  • Self-driving cars face the same problem humans do: they need to distinguish between movement of the vehicle and movement of objects in the world. The efference copy concept has influenced how engineers design motion detection systems.
  • Astronauts in space, where normal gravity is absent and movement feels different, sometimes report changes in spatial perception—echoing the idea that perception depends on how our bodies act in the world.