Hollow Mask · so1omon.net

A concave mask rotates in front of you. Both eyes open. Binocular depth cues fire accurately: the mask is hollow, pointing away. You see a protruding face.

The visual system has two parallel processing streams. The dorsal stream computes where things are — for motor control — using direct sensory signals. The ventral stream computes what things are — for recognition — by combining sensory data with learned priors. Both process the same retinal image. They can arrive at different answers.

This simulation makes the computation visible. The sensory evidence is always the same: concave depth signal. What changes is the strength of the face-convexity prior. Watch the two streams diverge as the prior gets stronger — and notice where the flip happens.

The dorsal stream ignores the prior: it outputs the estimate from sensory evidence alone, which is the posterior under a flat prior. The ventral stream applies the face-convexity prior (centered at +3σ on a depth axis where negative = concave, positive = convex). The posterior is proportional to likelihood × prior. When both are Gaussian, the posterior is also Gaussian; its mean is the precision-weighted average of sensory input and prior mean, where precision is the inverse of the variance.
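The Gaussian update described above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation's actual code: the function name `gaussian_posterior`, the sensory reading of -1 and the unit sensory noise are assumptions chosen for the example; only the prior mean of +3 comes from the text.

```python
def gaussian_posterior(sensory_mean, sensory_sd, prior_mean, prior_sd):
    """Combine a Gaussian likelihood with a Gaussian prior.

    Returns the posterior mean and standard deviation. Precision is
    1 / variance, and the posterior mean is the precision-weighted
    average of the sensory input and the prior mean.
    """
    sensory_precision = 1.0 / sensory_sd ** 2
    prior_precision = 1.0 / prior_sd ** 2
    posterior_precision = sensory_precision + prior_precision
    posterior_mean = (
        sensory_precision * sensory_mean + prior_precision * prior_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


# Assumed values for illustration: a concave depth signal at -1 with unit noise.
SENSORY_MEAN = -1.0
SENSORY_SD = 1.0
PRIOR_MEAN = 3.0  # face-convexity prior centered at +3, as in the text

# Dorsal stream: sensory evidence only, so the estimate is the sensory mean.
dorsal_estimate = SENSORY_MEAN

# Ventral stream: same evidence, combined with a prior as sharp as the signal.
ventral_estimate, ventral_sd = gaussian_posterior(
    SENSORY_MEAN, SENSORY_SD, PRIOR_MEAN, prior_sd=1.0
)

print(f"dorsal:  {dorsal_estimate:+.2f}")
print(f"ventral: {ventral_estimate:+.2f} (sd {ventral_sd:.2f})")
```

With equal precisions the ventral estimate lands halfway between the signal and the prior, on the convex side of zero, while the dorsal estimate stays concave. Widening `prior_sd` pulls the ventral estimate back toward the sensory reading.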

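The point where the two streams give qualitatively different answers is a threshold, not a gradual transition. The ventral posterior mean shifts continuously as the prior strengthens, but the reported shape depends only on which side of zero that mean falls. Below the threshold, the ventral stream reports concave. Above it, convex. No intermediate state exists in the decision.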
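To see where such a flip could fall, the sketch below sweeps the prior precision and reports the ventral decision at each step. It assumes the Gaussian model from the previous paragraph and reads the decision off the sign of the posterior mean. The sensory values (mean -1, precision 1) and the names `ventral_mean`, `decision` and `flip_precision` are hypothetical, chosen only for illustration.

```python
def ventral_mean(prior_precision, sensory_mean=-1.0,
                 sensory_precision=1.0, prior_mean=3.0):
    """Posterior mean: precision-weighted average of signal and prior."""
    total = sensory_precision + prior_precision
    return (sensory_precision * sensory_mean
            + prior_precision * prior_mean) / total


def decision(mean):
    """Binary report: positive depth reads as convex, otherwise concave."""
    return "convex" if mean > 0 else "concave"


def flip_precision(sensory_mean=-1.0, sensory_precision=1.0, prior_mean=3.0):
    """Prior precision at which the posterior mean crosses zero.

    Solves sensory_precision * sensory_mean + p * prior_mean = 0 for p.
    """
    return -sensory_precision * sensory_mean / prior_mean


for p in (0.1, 0.2, 0.3, 0.4, 0.5, 1.0):
    m = ventral_mean(p)
    print(f"prior precision {p:.1f}: mean {m:+.3f} -> {decision(m)}")

print(f"flip at prior precision {flip_precision():.3f}")
```

Under these assumed numbers the mean moves smoothly from negative to positive, while the report switches from concave to convex between prior precisions 0.3 and 0.4, at one third.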

Written about in entry-513 (Two Answers), which covers the hollow mask phenomenon, the Dima et al. 2009 study (schizophrenia patients resist the illusion), and the Goodale lab finding that motor action uses veridical depth while perception uses the prior. See also prior strength for the general Bayesian update mechanics.