Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2021 Nov 9;26(22):6761.
doi: 10.3390/molecules26226761.

Artificial Intelligence for Autonomous Molecular Design: A Perspective

Affiliations
Review

Artificial Intelligence for Autonomous Molecular Design: A Perspective

Rajendra P Joshi et al. Molecules. .

Abstract

Domain-aware artificial intelligence has been increasingly adopted in recent years to expedite molecular design in various applications, including drug design and discovery. Recent advances in areas such as physics-informed machine learning and reasoning, software engineering, high-end hardware development, and computing infrastructures are providing opportunities to build scalable and explainable AI molecular discovery systems. This could improve a design hypothesis through feedback analysis, data integration that can provide a basis for the introduction of end-to-end automation for compound discovery and optimization, and enable more intelligent searches of chemical space. Several state-of-the-art ML architectures are predominantly and independently used for predicting the properties of small molecules, their high throughput synthesis, and screening, iteratively identifying and optimizing lead therapeutic candidates. However, such deep learning and ML approaches also raise considerable conceptual, technical, scalability, and end-to-end error quantification challenges, as well as skepticism about the current AI hype to build automated tools. To this end, synergistically and intelligently using these individual components along with robust quantum physics-based molecular representation and data generation tools in a closed-loop holds enormous promise for accelerated therapeutic design to critically analyze the opportunities and challenges for their more widespread application. This article aims to identify the most recent technology and breakthrough achieved by each of the components and discusses how such autonomous AI and ML workflows can be integrated to radically accelerate the protein target or disease model-based probe design that can be iteratively validated experimentally. Taken together, this could significantly reduce the timeline for end-to-end therapeutic discovery and optimization upon the arrival of any novel zoonotic transmission event. Our article serves as a guide for medicinal, computational chemistry and biology, analytical chemistry, and the ML community to practice autonomous molecular design in precision medicine and drug discovery.

Keywords: artificial intelligence; autonomous workflow; computational modeling and simulations; computer aided drug discovery; deep learning; machine learning; machine reasoning and causal inference and causal reasoning; quantum mechanics and quantum computing; therapeutic design.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Figure 1
Closed-loop workflow for computational autonomous molecular design (CAMD) for medical therapeutics. Individual components of the workflow are labeled. It consists of data generation, feature extraction, predictive machine learning and an inverse molecular design engine.
Figure 2
Figure 2
Molecular representation with all possible formulation used in the literature for predictive and generative modeling.
Figure 3
Figure 3
The iterative update process used for learning a robust molecular representation either based on 2D SMILES or 3D optimized geometrical coordinates from physics-based simulations. The molecular graph is usually represented by features at the atomic level, bond level, and global state, which represents the key properties. Each of these features are iteratively updated during the representation learning phase, which are subsequently used for the predictive part of model.
Figure 4
Figure 4
Physics-informed ML framework for predictive modeling. It takes into account the properties obtained from quantum mechanics-based simulation or from experimental data to ultimately generate features in addition to the standard process used in benchmark models (e.g., message passing neural network (MPNN).
Figure 5
Figure 5
Generative model such as 3D-scaffold [69] can be used to inverse design novel candidates with desired target properties starting from core scaffold or functional group.
Figure 6
Figure 6
Molecular modeling methods used to study protein–ligand interactions including molecular docking simulations, molecular mechanics methods, hybrid Quantum Mechanics/Molecular Mechanics simulations, and deep learning models for the activity and affinity prediction.

References

    1. Zhavoronkov A., Ivanenkov Y.A., Aliper A., Veselov M.S., Aladinskiy V.A., Aladinskaya A.V., Terentiev V.A., Polykovskiy D.A., Kuznetsov M.D., Asadulaev A., et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 2019;37:1038–1040. doi: 10.1038/s41587-019-0224-x. - DOI - PubMed
    1. Kackar R.N. Off-Line Quality Control, Parameter Design, and the Taguchi Method. J. Qual. Technol. 1985;17:176–188. doi: 10.1080/00224065.1985.11978964. - DOI
    1. Kim E., Huang K., Tomala A., Matthews S., Strubell E., Saunders A., McCallum A., Olivetti E. Machine-learned and codified synthesis parameters of oxide materials. Sci. Data. 2017;4:1–9. doi: 10.1038/sdata.2017.127. - DOI - PMC - PubMed
    1. Leelananda S.P., Lindert S. Computational methods in drug discovery. Beilstein J. Org. Chem. 2016;12:2694–2718. doi: 10.3762/bjoc.12.267. - DOI - PMC - PubMed
    1. DiMasi J.A., Grabowski H.G., Hansen R.W. Innovation in the pharmaceutical industry: New estimates of R&D costs. J. Health Econ. 2016;47:20–33. - PubMed

LinkOut - more resources