The rapid advancement of artificial intelligence (AI) is revolutionizing the landscape of drug discovery. Researchers are increasingly utilizing machine-learning models to sift through billions of molecular candidates to identify those with the desired properties for developing new medications.
However, choosing which candidates to actually synthesize is complicated. Material costs and the risk of experimental failure both have to be weighed, so even with AI’s assistance, deciding which molecules are worth making remains difficult. This balancing act contributes to the lengthy timeline of new medicine development and drives up prescription drug prices.
To address these challenges, a team of researchers at MIT has introduced an algorithmic framework that automatically identifies optimal molecular candidates. The framework minimizes synthetic cost while maximizing the likelihood that the chosen candidates have the desired properties. It also identifies the materials and experimental steps required to synthesize the chosen molecules.
The quantitative framework, named Synthesis Planning and Rewards-based Route Optimization Workflow (SPARROW), accounts for the cost of synthesizing a batch of molecules at once, since many candidates can be derived from some of the same chemical compounds.
SPARROW also draws on information about molecular design, property prediction, and synthesis planning from online resources and widely used AI tools. Pharmaceutical companies could use SPARROW to speed up drug discovery. Its potential applications also extend to creating new agricultural chemicals and discovering specialized materials for organic electronics.
“Currently, selecting compounds is often more art than science; while there are successes, we can significantly enhance our decision-making process by utilizing predictive models and tools that inform our understanding of molecular performance and synthesis,” explains Connor Coley, the Class of 1957 Career Development Assistant Professor in the MIT departments of Chemical Engineering and Electrical Engineering and Computer Science, and senior author of a paper on SPARROW.
Coley is joined on the paper by lead author Jenna Fromer SM ’24. The research was published today in Nature Computational Science.
Understanding Complex Cost Considerations
At its core, deciding whether to synthesize and test a specific molecule means weighing the cost of synthesis against the value of the experiment. Both quantities are hard to pin down. On the cost side, an experiment might require expensive materials or be prone to failure. On the value side, one must consider how useful it would be to know the molecule's properties, and how uncertain the predictions of those properties are.
To increase efficiency, pharmaceutical companies are increasingly adopting batch synthesis. Rather than testing each molecule individually, they try many combinations of chemical building blocks in a single run. This requires all of the chemical reactions to run under the same experimental conditions, which makes cost and value even harder to estimate.
SPARROW handles this by considering the intermediate compounds shared among the syntheses of multiple molecules and incorporating that information into its cost-versus-value analysis.
“Envisioning the optimization process for designing a batch of molecules requires understanding how the cost of adding a new structure is influenced by the existing selections,” Coley states.
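The dependence Coley describes can be illustrated with a toy calculation. The sketch below is not SPARROW's implementation: the molecule names, reaction names, and costs are all hypothetical, and it assumes a reaction step shared by several molecules is paid for only once per batch.

```python
# Hypothetical data: each candidate maps to the set of reaction steps its route needs.
ROUTES = {
    "mol_A": {"rxn_1", "rxn_2"},
    "mol_B": {"rxn_1", "rxn_3"},  # shares rxn_1 with mol_A
    "mol_C": {"rxn_4", "rxn_5"},
}

# Hypothetical cost of running each reaction step once (arbitrary units).
STEP_COST = {"rxn_1": 5.0, "rxn_2": 2.0, "rxn_3": 3.0, "rxn_4": 4.0, "rxn_5": 6.0}


def batch_cost(selected):
    """Cost of a batch, counting each shared reaction step only once."""
    steps = set()
    for mol in selected:
        steps |= ROUTES[mol]
    return sum(STEP_COST[step] for step in steps)


def marginal_cost(candidate, selected):
    """Extra cost of adding `candidate` to the molecules already selected."""
    selected = set(selected)
    return batch_cost(selected | {candidate}) - batch_cost(selected)


if __name__ == "__main__":
    print(marginal_cost("mol_B", set()))       # mol_B on its own
    print(marginal_cost("mol_B", {"mol_A"}))   # mol_B when mol_A is already chosen
```

In this toy data, adding mol_B costs 8.0 on its own but only 3.0 once mol_A is in the batch, because the shared step rxn_1 has already been paid for.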
SPARROW also evaluates aspects such as the costs associated with starting materials, the number of reactions needed for each synthesis pathway, and the likelihood of achieving successful reactions on the first attempt.
To use SPARROW, scientists provide a set of molecular compounds they are interested in testing, along with a definition of the properties they hope to find. SPARROW then gathers information about the molecules and their synthesis pathways and weighs the value of each molecule against the cost of making it. It automatically selects the candidates that best fit the user's criteria and identifies the most cost-effective synthesis routes for them.
“This optimization process is executed in one step, allowing SPARROW to balance all competing objectives simultaneously,” Fromer explains.
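A small example can make the idea of a single, joint optimization concrete. The sketch below is an assumption-laden illustration, not the method from the paper: it scores every possible batch by expected reward minus shared synthesis cost and picks the best one by exhaustive search, which only works for a handful of candidates. All names, rewards, costs, success probabilities, and the `COST_WEIGHT` trade-off parameter are hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical routes, step costs, and first-attempt success probabilities.
ROUTES = {
    "mol_A": {"rxn_1", "rxn_2"},
    "mol_B": {"rxn_1", "rxn_3"},
    "mol_C": {"rxn_4", "rxn_5"},
}
STEP_COST = {"rxn_1": 5.0, "rxn_2": 2.0, "rxn_3": 3.0, "rxn_4": 4.0, "rxn_5": 6.0}
STEP_SUCCESS = {"rxn_1": 0.9, "rxn_2": 0.8, "rxn_3": 0.9, "rxn_4": 0.5, "rxn_5": 0.5}

# Hypothetical value of testing each molecule (e.g. from a property model).
REWARD = {"mol_A": 12.0, "mol_B": 10.0, "mol_C": 15.0}

COST_WEIGHT = 1.0  # assumed trade-off between reward and cost


def route_success(mol):
    """Probability that every step in the molecule's route succeeds."""
    return prod(STEP_SUCCESS[step] for step in ROUTES[mol])


def objective(selected):
    """Expected reward of a batch minus its cost, with shared steps paid once."""
    expected_reward = sum(REWARD[m] * route_success(m) for m in selected)
    steps = set().union(*(ROUTES[m] for m in selected)) if selected else set()
    cost = sum(STEP_COST[step] for step in steps)
    return expected_reward - COST_WEIGHT * cost


def best_batch(candidates):
    """Exhaustively search all batches; the empty batch (score 0) is the baseline."""
    best, best_score = (), 0.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(sorted(candidates), size):
            score = objective(subset)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


if __name__ == "__main__":
    batch, score = best_batch(ROUTES)
    print(batch, round(score, 2))
```

With these made-up numbers the search selects mol_A and mol_B together and leaves out mol_C. The pair scores far better than either molecule alone because they share rxn_1, which is the kind of interaction a molecule-by-molecule ranking would miss.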
A Versatile Solution
What sets SPARROW apart is its versatility: it can incorporate molecular structures designed by humans, structures found in virtual catalogs, and entirely new molecules generated by AI. “The beauty of SPARROW lies in its ability to level the playing field for diverse ideas,” Coley notes.
To assess SPARROW’s effectiveness, the researchers applied it to three case studies based on practical problems faced by chemists. These studies showed that SPARROW can formulate cost-efficient synthesis strategies for a variety of input molecules and can scale to hundreds of candidate molecules.
“In the realm of machine learning for chemistry, numerous models succeed in retrosynthesis or predicting molecular properties, but translating this into actionable guidance remains a challenge. Our framework is designed to leverage previous work, empowering researchers to consider cost and utility in compound selection,” Fromer remarks.
Looking ahead, the research team wants to build more complexity into SPARROW, for instance by letting the algorithm account for the possibility that the value of testing a given compound is not constant. They also plan to incorporate more elements of parallel chemistry into its cost-versus-value framework.
Outside experts have responded favorably. Patrick Riley, senior vice president of artificial intelligence at Relay Therapeutics, says the SPARROW framework aligns algorithmic decision-making with the realities of chemical synthesis, which should lead to better synthesis strategies for medicinal chemists. John Chodera, a computational chemist at Memorial Sloan Kettering Cancer Center, notes that the SPARROW approach automates a crucial step in drug discovery, saving effort for medicinal chemistry teams and paving the way for more autonomous approaches.
This research received support from organizations including the DARPA Accelerated Molecular Discovery Program, the Office of Naval Research, and the National Science Foundation.
Photo credit & article inspired by: Massachusetts Institute of Technology