The human body recombines different genes to make millions of disease-fighting antibodies. This genetic mixing and matching is essential to produce perfectly configured molecules that seek out and destroy dangerous pathogens.
As a result, each of us has around 100 million antibodies circulating through our bodies, poised to fight invaders. This huge antibody diversity is produced by three groups of genes: variable (V), diversity (D) and joining (J). VDJ recombination randomly selects one gene from each group and glues them together, generating the three-part VDJ genes that produce most antibodies.
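The mix-and-match step described above can be sketched in a few lines of Python. This is a toy illustration only: the gene names and sequences are invented, and the helper names `recombine` and `count_combinations` are hypothetical. It shows the selection and concatenation step and nothing else.

```python
import random

# Toy gene segments; names and sequences are invented for illustration only.
V_GENES = {"V1": "CAGGTG", "V2": "GAGGTC", "V3": "CAGCTG"}
D_GENES = {"D1": "GGTAT", "D2": "AGCAG"}
J_GENES = {"J1": "ACTGG", "J2": "TTGAC"}


def recombine(rng):
    """Pick one V, one D and one J gene at random and concatenate them."""
    v = rng.choice(sorted(V_GENES))
    d = rng.choice(sorted(D_GENES))
    j = rng.choice(sorted(J_GENES))
    return (v, d, j), V_GENES[v] + D_GENES[d] + J_GENES[j]


def count_combinations():
    """Number of distinct three-part V-D-J choices in this toy gene set."""
    return len(V_GENES) * len(D_GENES) * len(J_GENES)


if __name__ == "__main__":
    names, sequence = recombine(random.Random())
    print(names, sequence)
    print("possible VDJ combinations:", count_combinations())
```

Even with three V, two D and two J genes, the toy set already yields 12 distinct combinations, and the count grows multiplicatively as each group gets larger.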
Understanding how these genes combine to form antibodies could inform new vaccines and other immune therapies. However, key aspects of antibody biology have remained mysterious, such as how ultralong antibodies are formed.
Now, in a paper published in the journal Genome Research, scientists in the Department of Computer Science and Engineering (CSE) at the UC San Diego Jacobs School of Engineering have answered this important question, solving a 30-year-old mystery.
“Standard antibody genes, formed by concatenating (linking) three genes, are critically important to our immune response,” said senior author Pavel Pevzner, Ronald R. Taylor Professor of Computer Science. “However, the body also produces the longer, four-part variety, VDDJ, through a tandem fusion of two D genes. Normal recombination signals link D genes with V and J genes, but the classical recombination rule forbids linking any two D genes. So, how do these four-part antibodies even form?”
First discovered in 1989, four-part antibody genes were initially viewed as harmless, and possibly useless, aberrations. However, in the last ten years, researchers have shown ultralong antibodies, including D-D fusions, can deeply target HIV surface proteins, generating broadly neutralizing antibodies against the virus.
As a result, understanding VDDJ formation has become an important priority, part of an overarching effort to develop effective vaccines. But finding tandem D-D fusions in huge immuno-sequencing datasets has been a major computational challenge, as researchers must sift through multiple mutations to produce accurate results.
To solve the problem, Pevzner and coauthor Yana Safonova, a postdoctoral researcher at CSE, relied on a recently developed computational tool, called IgScout, to find D-D fusions. Using this approach, they identified a distinct antibody formation mechanism: short genomic segments, called cryptic recombination signals, flank D genes and trigger tandem D-D fusions. These cryptic signals make D genes look like V or J genes, which allows two D genes to be joined.
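The logic of the classical rule, and how a cryptic signal gets around it, can be illustrated with a small toy model. The "12" and "23" labels below are an assumption for illustration, loosely based on the textbook 12/23 rule for recombination signals; they are not a detail taken from the study. The table `SIGNALS` and the function `can_join` are hypothetical names, and this is not how IgScout works.

```python
# Toy model: each gene end carries a signal type, and two ends may be joined
# only when their signal types differ. The labels are illustrative assumptions.
SIGNALS = {
    "V": {"right": "23"},
    "D": {"left": "12", "right": "12"},
    "J": {"left": "23"},
}


def can_join(left_gene, right_gene, cryptic_right=None):
    """Return True if left_gene's right end can join right_gene's left end.

    cryptic_right, if given, stands in for a cryptic signal that overrides
    the normal signal on the right end of left_gene.
    """
    upstream = cryptic_right or SIGNALS[left_gene]["right"]
    downstream = SIGNALS[right_gene]["left"]
    return upstream != downstream


if __name__ == "__main__":
    print("V-D allowed:", can_join("V", "D"))
    print("D-J allowed:", can_join("D", "J"))
    print("D-D allowed:", can_join("D", "D"))
    print("D-D with cryptic signal:", can_join("D", "D", cryptic_right="23"))
```

In this sketch, a D gene whose end carries a V-like or J-like signal passes the same check that normally blocks D-D joining, which mirrors the explanation given above.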
The study also showed these ultralong antibodies are much more prevalent than previously thought and have helped mammals fight disease for millions of years.
“Contrary to the previous assumption that tandem D-D fusions are rare events, our analysis shows about a quarter of ultralong antibodies, in disease-relevant antibody repertoires, may be generated through tandem fusions,” said Safonova. “These cryptic recombination signals are not limited to humans but are preserved over millions of years of evolution across multiple mammalian species. In other words, tandem D-D fusions are an important mechanism to generate ultralong antibodies.”
Now that scientists understand how these antibodies form, they can use that knowledge to advance vaccine research.
“This paper solves the three-decade-old puzzle of aberrant tandem fusions in antibody formation,” said Pevzner. “These fusions contribute to broadly neutralizing antibodies that are important for ongoing vaccine development efforts.”