New Model Predicts Antibody Structures More Accurately

Researchers at MIT have made substantial progress in using artificial intelligence (AI), particularly large language models, to predict the structures of proteins from their amino acid sequences. The approach has been less successful for antibodies, however, in part because of their hypervariability.

To address this issue, the team has developed a computational technique that lets AI models predict antibody structures more accurately. The work could enable researchers to sift through millions of possible antibodies in search of candidates that could treat SARS-CoV-2 and other infectious diseases.

“Our innovative approach allows us to scale our findings effectively, enabling us to sift through vast options and identify a few ‘needles in the haystack,’” states Bonnie Berger, the Simons Professor of Mathematics and head of MIT’s Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory (CSAIL). “Our research could prevent pharmaceutical companies from entering clinical trials with unsuitable candidates, thereby saving considerable resources.”

The new technique focuses on modeling the hypervariable regions of antibodies and makes it possible to analyze the entire antibody repertoires of individuals. That could shed light on the immune responses of “super responders” to pathogens like HIV and help explain why their antibodies are particularly effective.

Bryan Bryson, an associate professor of biological engineering at MIT, is a senior author of the study, which was published this week in the Proceedings of the National Academy of Sciences. Rohit Singh and Chiho Im ’22 are among the lead authors, and researchers from Sanofi and ETH Zurich also contributed.

Modeling Antibodies’ Hypervariability

Proteins are made up of long chains of amino acids that can fold into an enormous number of possible structures. Artificial intelligence programs like AlphaFold have made these structures much easier to predict, but they have struggled with antibodies, particularly with the hypervariable regions that antibodies use to bind a wide variety of antigens. These regions, situated at the tips of the Y-shaped antibody structure, play a crucial role in detecting foreign proteins.

The hypervariable regions usually contain fewer than 40 amino acids, yet varying the sequence of those amino acids is estimated to yield around 1 quintillion distinct antibodies. Because these sequences are not shaped by the evolutionary constraints that guide other protein sequences, it is difficult for AI models to learn to predict their structures accurately.
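As a rough illustration of why so few residues allow so many possibilities, the sketch below counts the sequences available to a stretch of a given length, assuming any of the 20 standard amino acids can appear at each position. That assumption and the helper names are ours, not the study's. The 1 quintillion figure is an estimate of what the immune system can produce, not the output of this formula.

```python
# Illustrative arithmetic only: assumes each position can hold any of the
# 20 standard amino acids, which ignores real biological constraints.
AMINO_ACIDS = 20
QUINTILLION = 10**18


def sequence_space(length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    return AMINO_ACIDS ** length


def min_length_reaching(target: int) -> int:
    """Smallest length whose sequence space is at least `target`."""
    length = 0
    while sequence_space(length) < target:
        length += 1
    return length


if __name__ == "__main__":
    n = min_length_reaching(QUINTILLION)
    print(f"A stretch of {n} residues allows {sequence_space(n):.3e} sequences")
```

Under this simplification, a stretch of only 14 residues already allows more than 1 quintillion sequences, well within the fewer-than-40 residues of a typical hypervariable region.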

The research team created two modules that build on existing protein language models. The first module was trained on hypervariable sequences from approximately 3,000 antibody structures in the Protein Data Bank (PDB), allowing it to learn which sequences tend to produce similar structures. The second module was trained on data correlating about 3,700 antibody sequences with how strongly they bind three different antigens.

The resulting computational model, named AbMap, predicts antibody structures and binding strength from amino acid sequences. To demonstrate the model's usefulness, the researchers used it to predict antibody structures that would neutralize the SARS-CoV-2 spike protein, generating millions of variants by altering the hypervariable regions.

The model pinpointed the antibody structures most likely to succeed, and did so more accurately than traditional protein-structure prediction methods. The researchers then clustered the antibodies into groups with similar structures and selected candidates from the clusters for experimental testing. Working with Sanofi, they found that 82% of the selected antibodies bound more strongly than the original antibodies that went into the model.
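The cluster-then-select step can be sketched as follows. This is a minimal illustration under our own assumptions, not the study's pipeline: the article does not say which clustering algorithm was used, so a simple greedy cosine-similarity grouping stands in for it, and the embeddings and binding scores are made-up toy values rather than AbMap outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_cluster(embeddings, threshold=0.9):
    """Group vectors greedily (an assumed stand-in for the real method).

    Each vector joins the first cluster whose seed vector it resembles
    (cosine similarity >= threshold); otherwise it seeds a new cluster.
    Returns a list of clusters, each a list of indices into `embeddings`.
    """
    seeds, clusters = [], []
    for i, vec in enumerate(embeddings):
        for c, seed in enumerate(seeds):
            if cosine(vec, embeddings[seed]) >= threshold:
                clusters[c].append(i)
                break
        else:
            seeds.append(i)
            clusters.append([i])
    return clusters


def pick_candidates(clusters, scores):
    """From each cluster, keep the member with the highest predicted score."""
    return [max(members, key=lambda i: scores[i]) for members in clusters]


if __name__ == "__main__":
    # Hypothetical structure embeddings and predicted binding scores.
    embeddings = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0), (0.1, 0.98)]
    scores = [0.2, 0.7, 0.9, 0.4]
    clusters = greedy_cluster(embeddings)
    print(clusters, pick_candidates(clusters, scores))
```

Taking one candidate per cluster spreads the selection across structurally distinct groups instead of concentrating it on near-duplicates of a single antibody.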

Identifying promising candidates this early could help drug developers avoid spending money on clinical trials that ultimately fail. “Pharmaceutical companies prefer to diversify their candidate selection instead of risking all resources on a single antibody that might fail,” says Singh.

Antibody Comparison and Insights

Employing this advanced modeling technique may also help address critical questions regarding individual immune responses. Why do some people experience severe reactions to Covid-19 while others do not? And why are some who are exposed to HIV never infected?

Traditionally, scientists have investigated these questions through antibody repertoire analysis, which involves single-cell RNA sequencing of immune cells from different individuals. Past research showed that the antibody repertoires of two individuals may overlap by as little as 10%. Sequencing alone, however, does not give a comprehensive picture of antibody effectiveness, because antibodies with different sequences can have similar structures and functions.

The AbMap model lets researchers quickly generate structures for all of the antibodies found in an individual. Doing so showed that the structural overlap between repertoires is much greater than the overlap previously observed at the sequence level. Future research will examine how these structures may influence the immune response against specific pathogens.
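The difference between sequence-level and structure-level overlap can be made concrete with a small example. The sketch below is illustrative only: the article does not say how overlap was measured, so Jaccard similarity is our assumption, and the sequences and structural cluster labels are invented toy data.

```python
def jaccard(a, b):
    """Jaccard similarity: shared items divided by all distinct items."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical repertoires: toy sequence -> assumed structural cluster label.
person_a = {"ARDYW": 0, "ARGYW": 1, "AKDLW": 2}
person_b = {"ARDYW": 0, "ARDFW": 1, "AKELW": 2, "TRSSW": 3}

# Iterating a dict yields its keys, so this compares exact sequences.
sequence_overlap = jaccard(person_a, person_b)
# Comparing the labels asks whether the same structural groups appear.
structure_overlap = jaccard(person_a.values(), person_b.values())

if __name__ == "__main__":
    print(f"sequence overlap:  {sequence_overlap:.2f}")
    print(f"structure overlap: {structure_overlap:.2f}")
```

In this toy data the two repertoires share only one exact sequence but three of four structural groups, the same qualitative pattern the article describes.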

The research was funded by Sanofi and the Abdul Latif Jameel Clinic for Machine Learning in Health.

Photo credit & article inspired by: Massachusetts Institute of Technology
