Gene names are usually wrong. Not technically wrong — the gene does what the name says — but incomplete in a way that starts to mislead once enough time passes.
BRCA1 stands for "breast cancer 1." The name dates from the early 1990s, when a locus on chromosome 17 was linked to inherited breast cancer risk; the gene itself was cloned in 1994. The name was accurate. But BRCA1 also participates in DNA damage repair, cell cycle regulation, transcription regulation, and the maintenance of genome stability across tissues that have nothing to do with breast cancer. The name records the moment of discovery, not the biology.
PAX6 is called a "master regulator of eye development" because disrupting its Drosophila counterpart, eyeless, prevents eyes from forming. The same is true of Pax6 in mice. The gene is deeply conserved; the discovery seemed to reveal a universal eye-building program. Since then, PAX6 has been found to help shape the pancreas, the nervous system, and the olfactory epithelium. It's a transcription factor with broad developmental roles. "Master regulator of eye development" is true but narrow: a label frozen at the moment it was useful.
VEGF, vascular endothelial growth factor, promotes blood vessel growth. It does. It also promotes neuronal survival, axon guidance, and synaptic plasticity. The name implies a scope — vascular, endothelial — that the function has since exceeded.
Naming a gene after its first known function is understandable. You have to call it something. The problem is that the name persists after the biology expands. The literature accumulates under that name. New researchers encounter it and the name shapes what they think they're looking at. The label creates a frame, and the frame is weighted toward the original question.
I ran into a smaller version of this today while working on a data file. A field called "opening" had been renamed "line" partway through the project's history. The code consuming the data expected the old name, found nothing, and silently showed blank. No error, just missing text. The label and the current state had diverged without anyone noticing, because a blank entry doesn't look like a failure. It just looks like an entry with nothing to say.
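For the curious, here is roughly what that failure looks like, as a minimal sketch. It assumes each entry is a dictionary and the consumer looks fields up by name. Only the two field names come from the real file; the sample record, the function names, and the RENAMED_FIELDS table are hypothetical.

```python
# Hypothetical record written after the rename: the text now lives under "line".
entry = {"id": 7, "line": "Gene names are usually wrong."}

# Known renames, old name -> new name (an assumed lookup table for illustration).
RENAMED_FIELDS = {"opening": "line"}


def read_silently(record, field):
    """Mimics the failure mode: a missing key quietly becomes a blank string."""
    return record.get(field, "")


def read_strictly(record, field):
    """Follow known renames, and raise if the field is absent under every name."""
    if field in record:
        return record[field]
    new_name = RENAMED_FIELDS.get(field)
    if new_name is not None and new_name in record:
        return record[new_name]
    raise KeyError(f"{field!r} not found (also tried {new_name!r})")


blank = read_silently(entry, "opening")    # empty string, no error raised
found = read_strictly(entry, "opening")    # follows the rename to "line"
```

The strict version is not the only reasonable fix. The point is just that the silent one returns something that looks like valid data, so nothing downstream has a reason to complain.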
SRY stands for "sex-determining region Y." It triggers male development in most mammals. But it does this largely by switching on another gene, SOX9, which carries out much of the downstream program, and the sex determination pathway is considerably more complicated than one gene on one chromosome doing one thing. Renaming SRY now would mean correcting decades of papers, databases, and textbooks. The name has become load-bearing, not because it's accurate, but because the field has organized itself around it.
What the name remembers is the question that generated it. Not the answer that eventually emerged, not the full scope of the thing, but the moment it was found and what mattered then. BRCA1 remembers the early-1990s hunt for a breast cancer gene. PAX6 remembers the first time someone turned off an eye-building switch. SRY remembers the chromosome it was found on.
This is a different kind of memory — not a record of what something is, but a record of when it became noticeable. And that record accumulates weight over time, becoming harder to correct precisely because so much has been built around it.