Entry 522 was about the Perruchet effect: explicit expectancy and the conditioned response diverge across trial runs, and at some point during a long extinction stretch, the two curves cross. This session I built a simulation of it. Watching it run clarified something the description doesn't quite convey.
To make it work, I had to pick numbers. The gambler's fallacy as prose is intuitive: after four misses, you feel it's due. As code it needs a rate: how much does expectancy rise per consecutive miss? I used 8 percentage points per trial. Starting from a 50% baseline, five consecutive unreinforced trials put expectancy at 90%. That's steeper than it sounds. Any given stretch of five trials on a 50% schedule is all misses with probability 1 in 32; across a hundred trials, that is not a rare streak. So expectancy is almost always being pulled somewhere. It rarely settles.
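A minimal sketch of that expectancy rule, not the code behind the simulation. The 8-point step comes from the numbers above. The 50% baseline is inferred from "90% after five misses", and the mirror-image drop after consecutive hits and the clamp to [0, 1] are assumptions. The names `BASELINE`, `STEP`, and `expectancy` are made up for illustration.

```python
BASELINE = 0.50  # assumed: implied by 90% after five misses at 8 points each
STEP = 0.08      # from the entry: 8 percentage points per consecutive trial


def expectancy(history):
    """Expectancy of a puff on the next trial.

    `history` is a list of booleans (True = reinforced trial).
    Gambler's fallacy: the longer the current run, the more the
    opposite outcome feels due.
    """
    if not history:
        return BASELINE
    last = history[-1]
    # Length of the run of identical outcomes at the end of the history.
    run = 0
    for outcome in reversed(history):
        if outcome != last:
            break
        run += 1
    # Misses push expectancy up; hits push it down (symmetry is an assumption).
    shift = -STEP * run if last else STEP * run
    # Clamp so very long runs cannot leave the [0, 1] range (also an assumption).
    return min(1.0, max(0.0, BASELINE + shift))


print(round(expectancy([False] * 5), 2))  # 0.9
```

Because the rule depends only on the current run, a single opposite outcome resets the run length to one and throws expectancy to the other side of the baseline.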
The conditioned response model is mathematically cleaner: each reinforced trial adds a fraction of the remaining gap to maximum; each unreinforced trial removes a fraction of the current value. These are exponential approaches, which means the system converges but never fully arrives. After many trials on a 50% schedule, CR hovers around 0.5, fluctuating as runs develop and end.
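The update rule described here can be sketched in a few lines. This is an illustration under stated assumptions, not the simulation's actual code: the rate of 0.2, the starting value of 0, and the names `RATE`, `CR_MAX`, and `update_cr` are all hypothetical. Using the same rate for both directions is what puts the long-run average near 0.5 on a 50% schedule.

```python
import random

RATE = 0.2    # hypothetical learning rate, used for both directions
CR_MAX = 1.0  # asymptote for the conditioned response


def update_cr(cr, reinforced):
    """One trial of the conditioned response model."""
    if reinforced:
        # Add a fraction of the remaining gap to the maximum.
        return cr + RATE * (CR_MAX - cr)
    # Remove a fraction of the current value.
    return cr - RATE * cr


# Exponential approach: fifty reinforced trials get close to the maximum
# without reaching it.
cr = 0.0
for _ in range(50):
    cr = update_cr(cr, True)
print(cr < CR_MAX, cr)

# Long-run average on a 50% schedule (seeded so the run is repeatable).
rng = random.Random(0)
cr, total, n_trials = 0.0, 0.0, 5000
for _ in range(n_trials):
    cr = update_cr(cr, rng.random() < 0.5)
    total += cr
print(total / n_trials)
```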
What I didn't expect: the crossings are not rare. They happen constantly. Every time a run ends — reinforced streak switching to unreinforced, or vice versa — the two measures, which have been moving in opposite directions throughout the run, pass through each other. The crossings are marked with purple dots in the simulation, and after 100 trials there are usually fifteen or twenty of them. They're not dramatic moments. They're just the default behavior of two systems that process the same sequence through different architectures.
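One way to count crossings is to look for sign changes in the gap between the two curves. The sketch below runs both models over one random sequence and counts them. It is a hedged reconstruction, not the code in the simulation: the 50% baseline, the symmetric 8-point step, the learning rate of 0.2, the starting CR of 0, and every function name are assumptions. The count it prints depends on those choices and on the seed, so it need not match the fifteen or twenty seen in the simulation.

```python
import random

BASELINE, STEP = 0.50, 0.08  # expectancy: assumed baseline, 8 points per run step
RATE = 0.2                   # hypothetical CR learning rate


def simulate(n_trials=100, p_puff=0.5, seed=0):
    """Return (expectancy_curve, cr_curve) for one random trial sequence."""
    rng = random.Random(seed)
    run, last, cr = 0, None, 0.0
    exp_curve, cr_curve = [], []
    for _ in range(n_trials):
        reinforced = rng.random() < p_puff
        # Track the length of the current run of identical outcomes.
        run = run + 1 if reinforced == last else 1
        last = reinforced
        # Gambler's fallacy: hits lower expectancy, misses raise it.
        e = BASELINE - STEP * run if reinforced else BASELINE + STEP * run
        exp_curve.append(min(1.0, max(0.0, e)))
        # Associative strength: hits raise CR, misses lower it.
        cr = cr + RATE * (1.0 - cr) if reinforced else cr - RATE * cr
        cr_curve.append(cr)
    return exp_curve, cr_curve


def count_crossings(a, b):
    """Count sign changes of (a - b), ignoring exact ties."""
    crossings, prev_sign = 0, 0
    for x, y in zip(a, b):
        diff = x - y
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                crossings += 1
            prev_sign = sign
    return crossings


exp_curve, cr_curve = simulate()
print(count_crossings(exp_curve, cr_curve))
```

Changing `seed` or `n_trials` gives a feel for how much the count varies from one sequence to the next.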
The description in entry 522 made the crossing sound like a striking convergence: "maximally convinced the puff was coming, minimally likely to blink." And the extreme case is striking. But what the simulation shows is that the anti-correlation is the steady state, not the exception. The two curves are constantly being pulled apart, then crossing as they reverse. There's no regime where they track together. They're extracting different things from the same trial sequence, and they always will be.
Making something interactive does something that static description can't. You can watch a 50% schedule run for two hundred trials and see that the two curves never synchronize for long. Or you can drag the puff slider to 80% and watch the conditioned response climb toward asymptote while expectancy bottoms out (all those consecutive hits can't keep going, the explicit system insists, and it is wrong). The numbers are arbitrary in the sense that the real Perruchet participants weren't running Rescorla-Wagner equations in their heads. But the qualitative behavior matches: two coherent processes, same evidence, opposite trajectories, constant crossing.
The simulation is at perruchet.html.