A wireframe cube has no definite depth ordering. Either of two faces can be the "front" one, and your brain will swap between them every few seconds whether you want it to or not. Use the buttons to lock an interpretation by adding a shading cue. Notice that even after locking, the percept can still flip briefly before settling.
The same contour line defines both the edge of a vase and the profiles of two faces. The boundary belongs to neither interpretation — but you can only hold one at a time. Toggle which region the visual system treats as "figure" and which as "ground."
The same image — the same pixel values, the same retinal activations — produces different conscious experiences. Which means the experience isn't simply a read-out of the image. Something else is doing work: choosing, committing, maintaining a model of what's there.
In predictive coding terms: when two world-models fit the data equally well, the system can't resolve the ambiguity by updating toward one, because both have equal evidence. So it samples from them over time rather than converging. The interval between switches is roughly 3–5 seconds for most people; it is stable within individuals but varies across people, suggesting it reflects something about the dynamics of the inference machinery itself, not just the image.
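To make "sampling rather than converging" concrete, here is a toy sketch in Python. It is not a model of the visual system: it assumes a Metropolis-style sampler over two interpretations, and the names `simulate` and `propose_every` are hypothetical. The 4-second proposal interval is simply chosen to land in the 3–5 second range quoted above, so the timescale is put in by hand, not derived.

```python
import random

def simulate(p_a=0.5, steps=20_000, dt=0.1, propose_every=4.0, seed=0):
    """Toy Metropolis-style sampler over two interpretations, "A" and "B".

    p_a is the posterior probability of A (0.5 when evidence and prior are tied;
    must be strictly between 0 and 1). propose_every is an assumed average time
    in seconds between proposed switches. Returns a list of
    (state, dwell_seconds) pairs, one per completed dwell period.
    """
    rng = random.Random(seed)
    target = {"A": p_a, "B": 1.0 - p_a}
    state, dwell, dwells = "A", 0.0, []
    for _ in range(steps):
        dwell += dt
        if rng.random() < dt / propose_every:  # a switch is proposed
            other = "B" if state == "A" else "A"
            # Accept with the Metropolis ratio; this is always 1 when p_a == 0.5.
            if rng.random() < min(1.0, target[other] / target[state]):
                dwells.append((state, dwell))
                state, dwell = other, 0.0
    return dwells

dwells = simulate()
mean_dwell = sum(d for _, d in dwells) / len(dwells)
print(f"{len(dwells)} switches, mean dwell {mean_dwell:.1f} s")
```

With the two interpretations tied at 0.5, every proposed switch is accepted, so the sampler never settles: it keeps alternating, and only the proposal timescale sets how long each percept lasts.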
You can sometimes hold one interpretation intentionally, by focusing on a specific part of the figure or by imagining the scene in 3D. This is itself interesting: it means top-down attention can bias the prior, and so change which model "wins." The commitment isn't passive. But it's also not fully under voluntary control. The switching happens anyway, eventually.
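One way to see why biasing the prior matters so much here is to work through Bayes' rule for the tied case. The sketch below is illustrative only: the function name `posterior_a` and the numbers (a likelihood of 0.8 for both interpretations, priors of 0.5, 0.6 and 0.8) are assumptions, not measurements.

```python
def posterior_a(prior_a, like_a, like_b):
    """Posterior probability of interpretation A under Bayes' rule,
    given a prior for A and the likelihood of the image under A and under B."""
    num = prior_a * like_a
    return num / (num + (1.0 - prior_a) * like_b)

# The image fits both interpretations equally well (assumed likelihood 0.8 each),
# so the evidence cancels out and the posterior simply follows the prior.
for prior in (0.5, 0.6, 0.8):
    post = posterior_a(prior, like_a=0.8, like_b=0.8)
    print(f"prior for A = {prior:.1f} -> posterior for A = {post:.2f}")
```

When the evidence is tied, whatever attention adds to the prior passes straight through to the posterior. But as long as the prior for the other interpretation stays above zero, its posterior does too, which is consistent with the percept eventually switching even when you try to hold it.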