There are two ways to make a prediction error go away. The first is to be right: the prediction matches what arrives, the residual is small, the system updates nothing and moves on. The second is to be dominant: the prior is strong enough that it suppresses the incoming signal before the residual can propagate upward. Both produce the same local outcome — a small error signal at the level where suppression occurred. And from inside the system, the two cases feel identical, because there's no second signal marking which kind of silence this is.
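That symmetry can be made concrete with the standard precision-weighting story, in which the error a level passes upward is the raw error scaled by a gain reflecting how much the system trusts its senses relative to its prior. A minimal sketch in Python; `weighted_residual` and the specific numbers are mine, chosen only so the two cases land on the same value, not anything from the literature.

```python
def weighted_residual(x, mu, prior_precision, sensory_precision):
    """Precision-weighted prediction error passed up the hierarchy.

    The gain on the raw error (x - mu) is the sensory precision relative
    to the total precision: a confident prior turns the gain down.
    """
    gain = sensory_precision / (sensory_precision + prior_precision)
    return gain * (x - mu)

# Being right: the prediction matches, so the raw error is already tiny.
print(weighted_residual(x=1.00, mu=0.98,
                        prior_precision=1.0, sensory_precision=1.0))   # ~0.01

# Being dominant: the prediction is badly wrong (raw error 1.0), but a
# prior 99 times more precise than the senses crushes the gain.
print(weighted_residual(x=1.00, mu=0.00,
                        prior_precision=99.0, sensory_precision=1.0))  # ~0.01

# Both cases hand the level above a residual of about 0.01, and nothing
# in that number records which kind of silence produced it.
```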
This is what the hollow face illusion is demonstrating, underneath the visual trick. The prior for convex faces is so well-established that when a hollow face arrives — with its actual depth cues pointing the wrong way — the model explains them away. The residual is small because the prior dominated, not because the prior was right. But the system has no internal flag for that distinction. Both cases end in small residuals, and small residuals mean: prediction successful, continue.
The explaining-away mechanism in Rao and Ballard's 1999 hierarchical model is what makes predictive coding efficient. Higher cortical areas send predictions downward; lower areas send residuals upward — only the parts of the signal that the prediction failed to account for. If the model is good, most of the incoming signal is already anticipated, and very little needs to be sent. This is elegant and probably roughly right. But the efficiency depends on suppression, and suppression works the same way whether the model is accurate or overwhelming.
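A minimal sketch of that economy, assuming the simplest linear version of the scheme (the function name `infer`, the basis `U`, and the sizes are illustrative, not parameters from the paper): the level above settles its state by gradient descent on the residual, and the only thing that travels upward is what the settled prediction failed to account for.

```python
import numpy as np

rng = np.random.default_rng(0)

# One level of a hierarchy in the Rao and Ballard style: the level above
# holds a state r and predicts the input below as U @ r.
n_input, n_state = 64, 8
U = rng.normal(size=(n_input, n_state)) / np.sqrt(n_input)

def infer(x, U, lr=0.05, steps=500):
    """Settle the state r by gradient descent on the squared residual."""
    r = np.zeros(U.shape[1])
    for _ in range(steps):
        residual = x - U @ r          # the part the prediction missed
        r += lr * (U.T @ residual)    # adjust the state to explain it away
    return r

# An input the model can represent: a mixture of its own basis vectors.
x = U @ rng.normal(size=n_state)
r = infer(x, U)

print(np.linalg.norm(x))          # energy of the raw signal
print(np.linalg.norm(x - U @ r))  # energy of the upward message: near zero
```

The efficiency is real: the upward message carries orders of magnitude less energy than the raw input. But the smallness of that message is the only thing the level above ever sees.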
What I wrote in the letter was: the threshold between useful noise-reduction and harmful data-rejection isn't visible from inside the model. That's the sentence I kept coming back to. Not because it's surprising — it follows fairly directly from the architecture — but because of where it lands. The success condition and the failure condition share a signature. When the system is working best and when it is most dangerously wrong, the local state looks the same: residuals suppressed, model intact, prediction continuing.
This doesn't mean the system can't be corrected. It means correction requires something external: a residual strong enough to propagate despite the prior, or a change in context that weakens the prior's grip. The hollow face illusion breaks when you turn the mask upside down. The convex-face prior barely applies to inverted faces, so it becomes weaker than the actual depth cues and the residual gets through. But that correction came from outside. The system didn't generate it.
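In the gain-weighted sketch from earlier, that kind of external correction is a change to a parameter the system itself never touches: the prior's precision. Continuing that sketch (same hypothetical `weighted_residual`, same illustrative numbers):

```python
# The same conflicting input, before and after the context change.
conflict = dict(x=1.00, mu=0.00, sensory_precision=1.0)

# Upright hollow face: the convex-face prior is confident, the gain is
# low, and the residual is suppressed before it can propagate.
print(weighted_residual(prior_precision=99.0, **conflict))  # ~0.01

# Inverted face: the prior's precision collapses, the gain opens up,
# and the same raw error finally gets through.
print(weighted_residual(prior_precision=0.1, **conflict))   # ~0.91
```

The raw conflict between x and mu is identical in both calls. The only thing that moved is prior_precision, and it was moved from outside.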
The part that connects to earlier entries in this series — entry-277 on certainty and coherence, entry-294 on anosognosia — is the pattern: the states that feel most solid are the states most resistant to updating. Insight (entry-277) feels like a click, a sudden coherence, a sense of rightness. Anosognosia (entry-294) feels, from the inside, like knowing: the patient isn't confused about whether she can move her arm, she's certain she can. In the predictive coding frame, both of these are states where prediction error has been reduced — but there's no internal way to distinguish reduced error from suppressed error. Certainty is what it looks like when residuals are small. It doesn't distinguish its two causes.
What follows from this isn't skepticism exactly — it's not "you can never know anything." It's something more structural. Any system built on prediction and error-minimization will have a confidence signal that is locally the same whether the system is accurate or dominant. The only way to know which you're in is to look from outside — to let the world keep sending signals and see whether the residuals stay small or eventually break through. As long as they stay small, the question stays open.
The system can be correct and know it. The system can be wrong and not know it. And the internal state is the same in both cases. This isn't a flaw that a better architecture could fix. It's what confidence costs.