ACS Cent Sci. 2016 Oct 26;2(10):725-732.
doi: 10.1021/acscentsci.6b00219. Epub 2016 Oct 14.

Neural Networks for the Prediction of Organic Chemistry Reactions

Jennifer N Wei et al. ACS Cent Sci.

Abstract

Reaction prediction remains one of the major challenges for organic chemistry and is a prerequisite for efficient synthetic planning. It is desirable to develop algorithms that, like humans, "learn" from being exposed to examples of the application of the rules of organic chemistry. We explore the use of neural networks for predicting reaction types, using a new reaction fingerprinting method. We combine this predictor with SMARTS transformations to build a system which, given a set of reagents and reactants, predicts the likely products. We test this method on problems from a popular organic chemistry textbook.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
An overview of our method for predicting reaction type and products. A reaction fingerprint, made from concatenating the fingerprints of reactant and reagent molecules, is the input for a neural network that predicts the probability of 17 different reaction types, represented as a reaction type probability vector. The algorithm then predicts a product by applying to the reactants a transformation that corresponds to the most probable reaction type. In this work, we use a SMARTS transformation for the final step.
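The first two stages described in this caption can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the three reaction type labels, the 6-bit fingerprint, and the single linear layer with hand-picked weights are all hypothetical stand-ins for the 17-type trained network, and the final SMARTS transformation step is left out.

```python
import math

# Hypothetical labels: the paper's network predicts 17 reaction types; three
# invented ones are used here to keep the example small.
REACTION_TYPES = ["type_A", "type_B", "null_reaction"]


def reaction_fingerprint(reactant_fps, reagent_fps):
    """Concatenate per-molecule fingerprints into one reaction fingerprint."""
    fp = []
    for mol_fp in list(reactant_fps) + list(reagent_fps):
        fp.extend(mol_fp)
    return fp


def softmax(scores):
    """Turn raw scores into a probability vector that sums to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_type_probabilities(fp, weights, biases):
    """One linear layer plus softmax, standing in for the trained network."""
    scores = [
        sum(w * x for w, x in zip(row, fp)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def most_probable_type(probs):
    """Return the label with the highest predicted probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return REACTION_TYPES[best]


# Toy inputs (assumed values): one 4-bit reactant and one 2-bit reagent.
demo_fp = reaction_fingerprint([[1, 0, 1, 0]], [[0, 1]])
demo_weights = [
    [0.5, 0.0, 0.5, 0.0, 0.0, 0.5],  # rewards exactly the bits set in demo_fp
    [0.0, 0.5, 0.0, 0.5, 0.5, 0.0],  # rewards the bits that are not set
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
demo_probs = predict_type_probabilities(demo_fp, demo_weights, [0.0, 0.0, 0.0])
demo_choice = most_probable_type(demo_probs)
```

In a fuller pipeline, `demo_choice` would select which transformation to apply to the reactants in order to generate the predicted product.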
Figure 2
Cross validation results for (a) baseline fingerprint, (b) Morgan reaction fingerprint, and (c) neural reaction fingerprint. A confusion matrix shows the average predicted probability for each reaction type. In these confusion matrices, the predicted reaction type is represented on the vertical axis, and the correct reaction type is represented on the horizontal axis. These figures were generated on the basis of code from Schneider et al.
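One way to build a confusion matrix of average predicted probabilities, with the orientation this caption describes (predicted type on the vertical axis, correct type on the horizontal axis), is sketched below. The function name, the integer label encoding, and the zero fill for types with no examples are assumptions made for illustration; this is not the code from Schneider et al.

```python
def average_probability_matrix(prob_vectors, true_labels, n_types):
    """Return matrix[predicted][correct]: the mean probability assigned to
    `predicted` over all examples whose correct type is `correct`."""
    sums = [[0.0] * n_types for _ in range(n_types)]
    counts = [0] * n_types
    for probs, correct in zip(prob_vectors, true_labels):
        counts[correct] += 1
        for predicted in range(n_types):
            sums[predicted][correct] += probs[predicted]
    # Columns for types with no examples are left at 0.0 (an assumption).
    return [
        [
            sums[predicted][correct] / counts[correct] if counts[correct] else 0.0
            for correct in range(n_types)
        ]
        for predicted in range(n_types)
    ]


# Toy data (invented): two examples of type 0 and one example of type 1.
demo_matrix = average_probability_matrix(
    [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]],
    [0, 0, 1],
    2,
)
```

Because each input vector sums to 1, every column that has at least one example also sums to 1, and a strong diagonal indicates that most probability mass lands on the correct type.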
Figure 3
Wade problems (a) 8-47 and (b) 8-48.
Figure 4
Prediction results for (a) Wade problem 8-47 and (b) Wade problem 8-48, as displayed by estimated probability of correct reaction type. Darker (greener) colors represent a higher predicted probability. Note the large number of correct predictions in 8-47.
Figure 5
Product predictions for Wade 8-47 questions, with Tanimoto score. The true product is the product as defined by the answer key. The major predicted product is the product of the reaction type with the highest probability according to the Morgan fingerprint algorithm's result. The Morgan weighted score and the neural weighted score are calculated by averaging the Tanimoto scores over all the predicted products, with each score weighted by the probability of the reaction type that generated that product.
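The weighted score described in this caption can be illustrated with a short sketch. It assumes each product fingerprint is represented as a collection of on-bit indices and uses the standard Tanimoto (Jaccard) definition, intersection size over union size. The function names, the toy bit sets, and the handling of empty inputs are assumptions, not details taken from the paper.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty fingerprints are identical
    return len(a & b) / union


def weighted_tanimoto(predictions, true_bits):
    """Probability-weighted average Tanimoto score.

    `predictions` is a list of (probability, product_bits) pairs, one per
    reaction type that yields a product.
    """
    total_weight = sum(p for p, _ in predictions)
    if total_weight == 0:
        return 0.0  # assumed convention when nothing was predicted
    weighted_sum = sum(p * tanimoto(bits, true_bits) for p, bits in predictions)
    return weighted_sum / total_weight


# Toy example (invented bits): the most probable type reproduces the true
# product exactly, and a less probable type gives a partial match.
true_product = {1, 2, 3, 4}
demo_score = weighted_tanimoto(
    [(0.75, {1, 2, 3, 4}), (0.25, {1, 2})],
    true_product,
)
```

Here the exact match scores 1.0 and the partial match scores 0.5, so the weighted score is 0.75 × 1.0 + 0.25 × 0.5 = 0.875. Dividing by the total weight keeps the result a true average even when the supplied probabilities do not sum to 1.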

References

    1. Todd M. H. Computer-aided organic synthesis. Chem. Soc. Rev. 2005, 34, 247. 10.1039/b104620a.
    2. Szymkuć S.; Gajewska E. P.; Klucznik T.; Molga K.; Dittwald P.; Startek M.; Bajczyk M.; Grzybowski B. A. Computer-Assisted Synthetic Planning: The End of the Beginning. Angew. Chem., Int. Ed. 2016, 55, 5904–5937. 10.1002/anie.201506101.
    3. Corey E. J. Centenary lecture. Computer-assisted analysis of complex synthetic problems. Q. Rev., Chem. Soc. 1971, 25, 455–482. 10.1039/qr9712500455.
    4. Corey E.; Wipke W. T.; Cramer R. D. III; Howe W. J. Techniques for perception by a computer of synthetically significant structural features in complex molecules. J. Am. Chem. Soc. 1972, 94, 431–439. 10.1021/ja00757a021.
    5. Corey E.; Howe W. J.; Orf H.; Pensak D. A.; Petersson G. General methods of synthetic analysis. Strategic bond disconnections for bridged polycyclic structures. J. Am. Chem. Soc. 1975, 97, 6116–6124. 10.1021/ja00854a026.