Can Retrosynthesis Prediction Eliminate Frustration Between Design and Synthesis?

Published on: February 21, 2024

What do you think of computer-based retrosynthesis prediction? Retrosynthetic analysis is key to synthesizing any molecule that has no established synthetic route. When you design a molecule, you need to run a retrosynthetic analysis to confirm that the proposed structure is synthetically accessible. It is frustrating when a designed molecule with credible bioactivity turns out to be inaccessible. Here we discuss whether deep learning-based retrosynthesis prediction can eliminate this frustration between molecular design and synthesis.

The concept of retrosynthetic analysis was proposed by Elias James Corey, who was awarded the Nobel Prize in Chemistry in 1990. His theory was summarized in a 1988 review,¹⁾ and his original papers²⁾ and book³⁾ are still the bible of retrosynthetic analysis. Retrosynthetic analysis in fact comprises a number of elements. In pharmaceutical science, at least the following elements must be taken into consideration.

1. Overall synthetic strategy
2. Design of a series of chemical reactions
3. Evaluation of the feasibility of each reaction step
4. Prediction of byproducts and ease of separability
5. Estimated reaction time and conditions
6. Efficiency in synthesis and cost
7. Scalability and Safety

Retrosynthetic analysis has depended, and will continue to depend, on skilled synthetic chemists. A trained chemist has been essential for devising a reasonable and elegant synthetic route to a target molecule. With the advent of computational approaches, however, retrosynthetic analysis tools have become available in SciFinder and Reaxys. These tools are still being developed toward more reasonable and efficient retrosynthetic design.

Retrosynthesis prediction using graph neural networks (GNNs) is under intensive research. A graph transformer-based approach is feasible because graph structures are common in this field and have long been used to describe chemical structures.
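To make the idea of a chemical structure as a graph concrete, here is a minimal sketch in plain Python. Ethanol is our own illustrative choice, and the names `atoms`, `bonds`, and `adjacency` are hypothetical, not taken from any of the tools mentioned here.

```python
# Ethanol (CH3-CH2-OH) as a graph over its heavy atoms:
# nodes are atoms, edges are bonds. Hydrogens are left implicit.
atoms = ["C", "C", "O"]                       # node labels, indexed 0..2
bonds = [(0, 1, "single"), (1, 2, "single")]  # (atom i, atom j, bond type)


def adjacency(n_atoms, bonds):
    """Build an undirected adjacency list from a bond list."""
    neighbors = {i: [] for i in range(n_atoms)}
    for i, j, _bond_type in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    return neighbors


neighbors = adjacency(len(atoms), bonds)
degrees = {i: len(js) for i, js in neighbors.items()}

print(neighbors)  # {0: [1], 1: [0, 2], 2: [1]}
print(degrees)    # {0: 1, 1: 2, 2: 1}
```

Real pipelines would usually attach richer atom and bond features (charge, aromaticity, ring membership and so on) to the same node and edge structure.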

The approach was first reported for reaction center prediction and synthon completion in forward reaction prediction.⁴⁾,⁵⁾ In addition, GNN-based algorithms are expanding, exemplified by R-GCN (relational graph convolutional network),⁶⁾ GAT (graph attention network),⁷⁾ and MPNNs (message passing neural networks).⁸⁾
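What these GNN variants share is that each atom updates its representation from its bonded neighbors. The sketch below shows one heavily simplified message passing round. It is an illustration under our own assumptions: the messages are plain sums with no learned weights, whereas actual MPNNs use trainable message and update functions, and the function name `message_passing_step` is hypothetical.

```python
def message_passing_step(features, neighbors):
    """One simplified update: each atom adds the sum of its neighbors'
    feature vectors to its own (no learned weights)."""
    updated = {}
    for atom, own in features.items():
        message = [0.0] * len(own)
        for nb in neighbors[atom]:
            message = [m + x for m, x in zip(message, features[nb])]
        updated[atom] = [o + m for o, m in zip(own, message)]
    return updated


# Ethanol heavy atoms C0-C1-O2 with one-hot features [is_carbon, is_oxygen].
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}

h1 = message_passing_step(features, neighbors)
print(h1)  # {0: [2.0, 0.0], 1: [2.0, 1.0], 2: [1.0, 1.0]}
```

After one round the two carbons are no longer identical: the carbon bonded to oxygen now carries an oxygen component, which is how local chemical environment enters the representation.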

Let us take a brief look at a recent GNN-based retrosynthesis predictor. LocalRetro combines a GNN with a new concept of local reactivity and with global attention.⁹⁾ The authors tried to mimic the intuition of a synthetic chemist in two steps.
First, the local reactivity of chemical structures is captured in templates. Most chemists look at a molecule’s functional groups or characteristic structures to determine the most reasonable bond to cleave in retrosynthetic analysis.

Second, features of the whole molecule are taken into account, such as the possibility of neighboring group participation and the relative reactivity of different local templates. The authors used a novel algorithm, global reactivity attention (GRA), to achieve this global attention prediction.
LocalRetro was trained and evaluated on the USPTO-MIT dataset and achieved 97.4% top-5 round-trip accuracy, a metric that reflects how likely the predicted reactants are to react in the intended way.
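For readers unfamiliar with the metric, the sketch below computes a top-k round-trip accuracy on toy data. It assumes one common definition: a target counts as a hit if any of its top-k predicted reactant sets, fed to a forward reaction predictor, gives back the target product. The exact definition in the LocalRetro paper may differ in detail. The lookup table `TOY_FORWARD` is a hypothetical stand-in for a trained forward model, and the SMILES examples are our own.

```python
def top_k_round_trip_accuracy(predictions, forward_model, k):
    """Fraction of targets for which at least one of the top-k predicted
    reactant sets is mapped back to the target by the forward model."""
    hits = 0
    for product, candidates in predictions.items():
        if any(forward_model(r) == product for r in candidates[:k]):
            hits += 1
    return hits / len(predictions)


# Toy stand-in for a forward reaction predictor (assumption, not a real model).
TOY_FORWARD = {
    "CC(=O)O.CCO": "CC(=O)OCC",   # acetic acid + ethanol -> ethyl acetate
    "CC(=O)Cl.CCO": "CC(=O)OCC",  # acetyl chloride + ethanol -> ethyl acetate
    "CC(=O)O.CN": "CC(=O)NC",     # acetic acid + methylamine -> N-methylacetamide
}


def toy_forward(reactants):
    return TOY_FORWARD.get(reactants)


# Ranked reactant suggestions per target product (best first).
predictions = {
    "CC(=O)OCC": ["CCO.CC", "CC(=O)O.CCO"],  # only the 2nd suggestion round-trips
    "CC(=O)NC": ["CC(=O)O.CN"],              # the top suggestion round-trips
}

top1 = top_k_round_trip_accuracy(predictions, toy_forward, k=1)
top2 = top_k_round_trip_accuracy(predictions, toy_forward, k=2)
print(top1, top2)  # 0.5 1.0
```

Unlike exact-match accuracy, this metric also credits suggestions that differ from the recorded reactants but are still judged viable by the forward model, such as the acetyl chloride route to ethyl acetate in the toy table.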

Retrosynthesis prediction is becoming practical, and LocalRetro is just one example. Automated retrosynthesis models such as RetroExplainer,¹⁰⁾ GraphRXN,¹¹⁾ and others offer us an opportunity to construct synthetic plans rationally and with high accuracy.
Although most models have not yet been evaluated in a generative manner, DL-based models could also be extended to design and incorporate new potential reactions into a retrosynthesis.

We would be glad to have a chance to test the feasibility of retrosynthesis prediction for PepMetics molecules. Please feel free to contact us if you are interested in this topic.

1) https://doi.org/10.1039%2FCS9881700111
2) https://doi.org/10.1002/anie.199104553
3) The Logic of Chemical Synthesis. New York: Wiley. ISBN 978-0-471-11594-6
4) https://doi.org/10.1021%2Fjacs.0c04715
5) https://doi.org/10.1126%2Fscience.aat2299
6) https://doi.org/10.1016%2Fj.chempr.2020.02.017
7) https://doi.org/10.1109%2FTNNLS.2020.2978386
8) https://doi.org/10.1002%2Fwcms.1604
9) https://doi.org/10.1021/jacsau.1c00246
10) https://doi.org/10.1038/s41467-023-41698-5
11) https://doi.org/10.1186/s13321-023-00732-w

