Digit Discov. 2023 Mar 10;2(3):663-673.
doi: 10.1039/d3dd00006k. eCollection 2023 Jun 12.

Quantum chemical data generation as fill-in for reliability enhancement of machine-learning reaction and retrosynthesis planning

Alessandra Toniato et al. Digit Discov. 2023.

Abstract

Data-driven synthesis planning has seen remarkable successes in recent years by virtue of modern approaches of artificial intelligence that efficiently exploit vast databases with experimental data on chemical reactions. However, this success story is intimately connected to the availability of existing experimental data. It may well occur in retrosynthetic and synthesis design tasks that predictions in individual steps of a reaction cascade are affected by large uncertainties. In such cases, it will, in general, not be easily possible to provide missing data from autonomously conducted experiments on demand. However, first-principles calculations can, in principle, provide missing data to enhance the confidence of an individual prediction or for model retraining. Here, we demonstrate the feasibility of such an ansatz and examine resource requirements for conducting autonomous first-principles calculations on demand.

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. Workflow of QC-enhanced AI-based retrosynthesis planning: (A) a prediction of possible disconnections is made for a target molecule. (B) A confidence score is computed for these predictions. Some of these suggestions are potentially correct but predicted with low confidence by the AI model (in IBM RXN) due to a lack of training data. (C) First-principles reactivity explorations (with SCINE Chemoton) are initiated to validate or invalidate the predictions (or a subset of them). (D) The original confidence score and the result of the first-principles explorations are combined to decide which predictions should be adopted for the synthesis planning. (E) The above procedure is iterated for the next steps.
Fig. 2. Williamson ether synthesis of ethoxybenzene from iodoethane and phenol. Both the non-mapped and the atom-mapped Lewis structures are shown alongside the corresponding SMILES representations.
Fig. 3. Friedel–Crafts acylation reaction.
Fig. 4. Left: Similarity of all products present in the training dataset and linked to a Friedel–Crafts acylation reaction, plotted against the product of the considered Friedel–Crafts reaction of Fig. 3. Right: The 5 molecules with the highest similarity score against the target compound (top left of the grid). Note that the product with the highest similarity is the one most confusing (F-group in para position vs. meta).
Fig. 5. Elementary steps of the Friedel–Crafts reaction shown in Fig. 3.
Fig. 6. Multistep retrosynthesis algorithm logic, taken from Schwaller et al.
Fig. 7. Graphical user interface view for the result of the retrosynthetic analysis of the product molecule of Section 3.2. Left: selection of the first reaction step, which is characterized by a low confidence. Right: global view for the full retrosynthetic route. The low confidence score is caused by the first reaction step.
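
Steps (B) to (D) of the workflow in Fig. 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call IBM RXN or SCINE Chemoton. The names `Prediction`, `select_predictions`, and `explore`, the threshold value, and the simple rule "adopt a prediction if the model is confident or the first-principles exploration validates it" are all assumptions made for illustration; the caption does not state how the two pieces of information are actually combined.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Prediction:
    """One suggested disconnection (hypothetical container, not an IBM RXN type)."""
    reaction_smiles: str
    confidence: float  # model confidence score, assumed to lie in [0, 1]


def select_predictions(
    predictions: List[Prediction],
    explore: Callable[[str], bool],
    threshold: float = 0.5,  # assumed cutoff, not a value from the paper
) -> List[Prediction]:
    """Combine model confidence with a first-principles verdict (assumed rule).

    `explore` stands in for an on-demand quantum chemical exploration and
    returns True if the reaction is validated, False if it is invalidated.
    """
    adopted = []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            # High-confidence suggestions are adopted without extra calculations.
            adopted.append(prediction)
        elif explore(prediction.reaction_smiles):
            # Low-confidence suggestions are adopted only if validated.
            adopted.append(prediction)
    return adopted


if __name__ == "__main__":
    # Williamson ether synthesis of ethoxybenzene (Fig. 2), written as reaction SMILES.
    williamson = "CCI.Oc1ccccc1>>CCOc1ccccc1"
    candidates = [
        Prediction(williamson, confidence=0.9),  # confidence values are made up
        Prediction(williamson, confidence=0.2),
    ]
    # Stub exploration that validates everything; a real one would be expensive.
    kept = select_predictions(candidates, explore=lambda smiles: True)
    print(len(kept))
```

Passing the exploration as a callable keeps the costly first-principles step separate from the decision rule, so it is only invoked for suggestions that fall below the confidence threshold.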

References

    1. Shen J. Nicolaou C. A. Drug Discovery Today: Technol. 2019;32–33:29–36. doi: 10.1016/j.ddtec.2020.05.001.
    2. Schwaller P. Vaucher A. C. Laino T. Reymond J.-L. ChemRxiv. 2020. preprint. doi: 10.26434/chemrxiv.12758474.v2.
    3. Elton D. C. Boukouvalas Z. Fuge M. D. Chung P. W. Mol. Syst. Des. Eng. 2019;4:828–849. doi: 10.1039/C9ME00039A.
    4. Meyers J. Fabian B. Brown N. Drug Discovery Today. 2021;26:2707–2715. doi: 10.1016/j.drudis.2021.05.019.
    5. Segler M. H. S. Waller M. P. Chem.–Eur. J. 2017;23:5966–5971. doi: 10.1002/chem.201605499.
