Audiovisual Fusion

The Bayesian mechanics of the McGurk effect

In 1976, Harry McGurk and John MacDonald reported a percept, stumbled on through a dubbing error, that wasn't in either input: watching a speaker mouth "ga" while hearing "ba," they heard "da." This is the McGurk effect. The brain doesn't average conflicting sensory signals; it finds the most probable source consistent with both.

This demo models that fusion as Bayesian combination: each sense provides a likelihood distribution over possible phonemes, and the brain multiplies them to get a posterior. Adjust the reliability of each sense to see how the result shifts.

Live simulation — adjust sliders to change sensory reliability

Auditory signal (says "ba"), audio reliability 0.80
  ba 80%   da 18%   ga 2%

Visual signal (shows "ga"), visual reliability 0.80
  ba 2%   da 18%   ga 80%

Combined posterior (what you hear)
  ba 27%   da 54%   ga 19%

Result: da

Both signals are confident and conflicting. The brain finds a third phoneme consistent with both — and delivers it as fact, with no trace of the ingredients.

At equal, high reliability, both signals confidently disagree. The auditory stream says bilabial stop ("ba"), and the visual stream says velar stop ("ga"). Neither wins outright, but "da," the alveolar stop, has moderate compatibility with both. With the likelihoods shown above, "da" scores 0.18 × 0.18 = 0.0324, while "ba" and "ga" each score only 0.80 × 0.02 = 0.016. The product of the likelihoods therefore peaks at "da," so that's what's heard.
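
The multiplication described above can be checked with a few lines of Python. This is a minimal sketch, not the demo's actual code: the likelihood values are read off the default slider state, and the function name `fuse` is a hypothetical choice made for illustration.

```python
# Likelihoods as displayed by the demo at audio = visual reliability = 0.80.
audio = {"ba": 0.80, "da": 0.18, "ga": 0.02}
visual = {"ba": 0.02, "da": 0.18, "ga": 0.80}


def fuse(a, v):
    """Multiply per-phoneme likelihoods, then normalize so they sum to 1."""
    product = {p: a[p] * v[p] for p in a}
    total = sum(product.values())
    return {p: x / total for p, x in product.items()}


posterior = fuse(audio, visual)
winner = max(posterior, key=posterior.get)

for phoneme, prob in posterior.items():
    print(f"{phoneme}: {prob:.3f}")
print("perceived:", winner)
```

A plain product of these numbers gives roughly 25% / 50% / 25% for ba / da / ga. The winner matches the demo, but the displayed 27% / 54% / 19% is not symmetric, so the demo presumably includes model details that the text does not state.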

This is the McGurk effect. The result wasn't in either input. It's a construction — and from inside, it's indistinguishable from straightforward perception.

How this works: Each sense provides a likelihood P(evidence | phoneme). The brain multiplies likelihoods across senses and normalizes to get a posterior: P(phoneme | audio, visual) ∝ P(audio | phoneme) × P(visual | phoneme). Reliability sliders control how peaked vs. flat each distribution is — high reliability means the signal strongly favors one phoneme; low reliability means the signal is ambiguous. When both signals are high-reliability and point to different phonemes, the intermediate "da" wins because it's the only option that isn't strongly ruled out by either source.
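
One way to picture what the reliability sliders do is to flatten or sharpen each likelihood before multiplying. The sketch below is an assumption, not the demo's implementation: the text does not say how a slider value maps to a distribution, so a hypothetical `sharpness` exponent stands in for it (1 keeps the displayed distribution, 0 makes it uniform). The names `temper`, `fuse`, and `perceived` are likewise illustrative.

```python
# Base likelihoods as displayed in the demo's default state.
AUDIO = {"ba": 0.80, "da": 0.18, "ga": 0.02}
VISUAL = {"ba": 0.02, "da": 0.18, "ga": 0.80}


def temper(likelihood, sharpness):
    """Hypothetical reliability knob: exponentiate, then renormalize.

    sharpness=1 keeps the distribution; sharpness=0 makes it uniform.
    """
    powered = {p: v ** sharpness for p, v in likelihood.items()}
    total = sum(powered.values())
    return {p: v / total for p, v in powered.items()}


def fuse(a, v):
    """Posterior proportional to P(audio | phoneme) * P(visual | phoneme)."""
    product = {p: a[p] * v[p] for p in a}
    total = sum(product.values())
    return {p: x / total for p, x in product.items()}


def perceived(audio_sharpness, visual_sharpness):
    """Return the phoneme with the highest fused posterior."""
    posterior = fuse(temper(AUDIO, audio_sharpness),
                     temper(VISUAL, visual_sharpness))
    return max(posterior, key=posterior.get)


print(perceived(1.0, 1.0))  # both streams sharp and conflicting
print(perceived(1.0, 0.2))  # visual stream flattened
print(perceived(0.2, 1.0))  # auditory stream flattened
```

Under this assumed mapping, two sharp, conflicting streams fuse to "da," while flattening one stream lets the other dominate: the percept moves to "ba" when vision is unreliable and to "ga" when audition is.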