At the Tip
In the 1960s, Paul Bach-y-Rita built a chair with four hundred vibrating pins set into its back. A camera fed a signal into the pins. Blind people sat in the chair, held the camera, and moved it around the room.
At first they felt the pins. Then something shifted: they stopped feeling the pins and started perceiving the room. Objects out there, at a distance, with shape and location. Not a sensation on the back but a scene in front of them.
Bach-y-Rita called this perceptual transparency. The device disappeared. What was a felt vibration became a perceived space.
The shift only happened when they moved the camera themselves. If someone else moved it — same signal, same pins — the transparency never came. The percept stayed proximal. They felt their back.
There's a version of this that everyone already knows. When you're learning to use a white cane, you feel the cane in your hand: the texture of the grip, its weight, the vibration as it taps. After months of practice, you stop noticing any of that. You feel the ground at the tip. A curb, a crack, the edge of a mat. The information arrives through your palm but you don't experience it there. You experience it a meter away, at the end of the stick.
Bach-y-Rita drew this parallel himself. His device was just a longer cane. The camera was the tip; the pins on the back were the handle. Given enough practice, the brain learned to report from the tip.
The same structure is in ordinary vision. Light hits the retina. You don't see the retina. You see the room. The photons are the proximal event; the scene is where the experience lives. Somehow the brain takes signals arriving at the back of the eye and locates the percept at their inferred source — a face, a wall, a distance of twelve feet. Always at the far end of the causal chain, not the near end.
What sensory substitution adds to this picture is the part that's easy to miss: the modality isn't fixed by the physics.
Vision and touch are usually treated as categorically different. Different receptors, different cortical regions, different qualia. But when the vibrotactile signal is structured like spatial visual information — organized to encode shape, distance, position — and when you're the one moving the camera, generating and checking predictions about what the signal should do, the brain can learn to run it through the visual processing path. The tactile signals start being interpreted as spatial. Not labeled as spatial — actually processed that way, with depth and location and figure-ground separation.
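To make "structured like spatial visual information" concrete, here is a minimal sketch of that kind of mapping in Python. The names and the pipeline are mine for illustration; the original hardware did this in analog electronics, not code. Each pin averages one fixed patch of the camera frame, so shape and position in the image survive as shape and position on the skin.

```python
import numpy as np

def frame_to_pins(frame: np.ndarray, grid: int = 20) -> np.ndarray:
    # Map a grayscale camera frame onto a grid x grid array of vibration
    # intensities in [0, 1]. A 20 x 20 grid gives 400 pins, one per patch.
    h, w = frame.shape
    side = min(h, w)
    # Center-crop to a square so pin (i, j) always watches the same
    # patch of the scene.
    top, left = (h - side) // 2, (w - side) // 2
    square = frame[top:top + side, left:left + side]
    patch = side // grid
    trimmed = square[:patch * grid, :patch * grid]
    # Block-average: each pin's drive is the mean brightness of its patch,
    # so spatial layout in the image becomes spatial layout on the skin.
    pins = trimmed.reshape(grid, patch, grid, patch).mean(axis=(1, 3))
    lo, hi = pins.min(), pins.max()
    return (pins - lo) / (hi - lo + 1e-9)
```

Nothing in the output marks the signal as tactile rather than visual. What it carries is geometry.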
What determines the modality isn't the receptor. It's the structure of the information and whether the brain can correlate it with its own movement.
This is what the active/passive result shows. When you move the camera, you know in advance what the signal should do when you turn left. The gap between prediction and signal is the scene. When someone else moves it, there's no prediction to compare against, and the signal is just noise on the skin.
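A cartoon of that asymmetry, continuing the sketch above (hypothetical names again, a toy rather than a model of cortex): an observer who issued the motor command can predict the shift it will cause in the pin pattern and subtract it out, and whatever remains is attributable to the world. A passive observer has nothing to subtract.

```python
import numpy as np

def predicted_shift(pins: np.ndarray, pan: int) -> np.ndarray:
    # Forward model: panning the camera right shifts the pin pattern
    # left by a known number of columns.
    return np.roll(pins, -pan, axis=1)

def world_residual(prev: np.ndarray, curr: np.ndarray, motor_pan=None):
    # Passive case: no motor command, so no prediction to compare
    # against. The change on the skin stays unexplained.
    if motor_pan is None:
        return None
    # Active case: subtract the self-caused shift; the residual is
    # evidence about the scene, not about the observer's own movement.
    return curr - predicted_shift(prev, motor_pan)
```

In this toy, the scene literally is the gap between prediction and signal.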
The question this leaves open is one I don't know how to resolve.
If any patterned signal can be run through the visual processing path given enough training — and there's some evidence for this, including David Eagleman's work on feeding arbitrary data into the skin — then what exactly is visual experience? Is it the output of a specific processing architecture, wherever the inputs come from? Or is there something about light-through-lens that constitutes what it's like to see, something that the vibrotactile route can only approximate?
Blind users of these devices, when they describe the experience, don't call it seeing. They call it knowing where things are. Knowing is different from seeing, and they know the difference. So the transparency isn't complete. Something is preserved in the distinction they draw.
But what? The signal path changed. The processing recruits visual cortex. The behavioral output — navigate a room, catch a thrown ball, read projected letters — is functionally visual. Where does the remaining difference live?
I don't think the answer is clear. Maybe the remaining difference is the accumulated weight of a lifetime of calibration between visual signals and everything else — the learned connection between light and color, between perspective distortion and depth, between the way things look wet and the feeling of water. The blind user's visual cortex is working with a thin feed compared to what it was shaped to handle, and the thinness shows up as "knowing" rather than "seeing."
Or maybe the difference is more basic than that. Maybe there's something about the modality that isn't reducible to the processing architecture, and the vibrotactile route genuinely cannot reach it. In that case, Bach-y-Rita's device is a remarkable prosthetic that creates a new perceptual capacity — which is already extraordinary — but doesn't restore the original one.
The striking thing either way: a signal that arrives at the skin can end up being experienced as happening in space, at a distance, outside the body. The brain doesn't report from where the information arrives. It reports from where the information came from.