J Cheminform. 2020 Jun 8;12(1):42.
doi: 10.1186/s13321-020-00446-3.

Chemical space exploration based on recurrent neural networks: applications in discovering kinase inhibitors

Xuanyi Li et al. J Cheminform.

Abstract

With the rise of artificial intelligence (AI) in drug discovery, de novo molecular generation provides new ways to explore chemical space. However, because de novo molecular generation methods rely on large sets of known molecules, the generated molecules may lack novelty. Novelty is important in highly competitive areas of medicinal chemistry, such as the discovery of kinase inhibitors. In this study, de novo molecular generation based on recurrent neural networks was applied to discover a new chemical space of kinase inhibitors. During the application, the practicality of the approach was evaluated, and new inspiration was found. With the successful discovery of one potent Pim1 inhibitor and two lead compounds that inhibit CDK4, AI-based molecular generation shows potential in drug discovery and development.

Keywords: Chemical space; De novo molecular generation; Kinase inhibitors; Recurrent neural networks.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Chemical space exploration with RNNs. (a) Direct chemical space exploitation around known active molecules. Generator_1 is used to generate SMILES sequences after being trained with SMILES sequences of active molecules. (b) Chemical space exploration for an unknown space. RNNs and virtual screening are combined to realize virtual screening based on local chemical space. Molecules are generated with generator_2, which has been trained through transfer learning (TL)
Fig. 2
Training and molecule generation based on canonical SMILES sequences and randomized SMILES sequences. (a) The convergence of both models; the loss values were recorded every 200 steps. (b, c) Similarity of newly generated unique molecules to their closest inhibitors after training with canonical SMILES sequences (b) or randomized SMILES sequences (c) for 1000 steps and 2000 steps. (d, e) t-SNE plots of combined unique molecules generated through sampling twice after training with canonical SMILES sequences (d) or randomized SMILES sequences (e). 2-D coordinates of CDK4 inhibitors, Pim1 inhibitors and newly generated molecules are colored blue, green and yellow, respectively
Fig. 3
Dose-response curves of MJ-1055 on CDK4 (a) and Pim1 (b). For each concentration, tests were performed with at least two replicates
Fig. 4
The filtered molecules after the first round of virtual screening with the pharmacophore model for Pim1 and the molecular docking model for CDK4 and -PMF. (a) t-SNE plot of CDK4 inhibitors (blue), Pim1 inhibitors (green), abemaciclib (red), inactive molecules (magenta) and screened molecules (yellow). (b) Number of the most similar molecules compared to the seven target molecules
Fig. 5
The filtered chemical space after the second round of virtual screening with the pharmacophore model for Pim1 and the molecular docking model for CDK4 and -PMF. (a) t-SNE plot of CDK4 inhibitors (blue), Pim1 inhibitors (green), abemaciclib (red), inactive molecules (magenta) and screened molecules (yellow). (b) Number of the most similar molecules compared to the seven target molecules
Fig. 6
Representative molecules in the transferred chemical space moving away from CHEMBL1802355. Molecules from left to right are CHEMBL1802355, the molecule most similar to CHEMBL1802355, and the cluster center of the chemical space to which CHEMBL1802355 belongs
Fig. 7
t-SNE plot of the screened molecules. Coordinates of molecules directly screened by the pharmacophore model and the molecular docking model built for CDK4 are labeled in red. After chemical space extension of 10 molecules with higher -PMF scores and the second round of virtual screening, the molecules whose coordinates are in green were finally screened. The compounds finally obtained and tested are flagged with their ID numbers
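
The Fig. 2 caption refers to the similarity of each generated molecule to its closest known inhibitor. The sketch below illustrates that nearest-neighbor idea with a Tanimoto coefficient. It rests on loud assumptions: this record does not state which fingerprint the authors used, so `toy_fingerprint` (character n-grams of the SMILES string) is a hypothetical stand-in chosen only to keep the example dependency-free, and the function names are illustrative. A character-based fingerprint depends on how the SMILES string is written, so real work would likely use a structure-based fingerprint from a cheminformatics toolkit instead.

```python
from __future__ import annotations

from typing import Iterable


def toy_fingerprint(smiles: str, n: int = 3) -> frozenset[str]:
    """Hypothetical stand-in for a molecular fingerprint: the set of
    character n-grams of the SMILES string (NOT a chemical fingerprint)."""
    if len(smiles) < n:
        return frozenset({smiles})
    return frozenset(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def tanimoto(a: frozenset[str], b: frozenset[str]) -> float:
    """Tanimoto (Jaccard) coefficient: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def closest_reference(query: str, references: Iterable[str]) -> tuple[str, float]:
    """Return the reference SMILES most similar to the query, with its score."""
    q = toy_fingerprint(query)
    scored = ((ref, tanimoto(q, toy_fingerprint(ref))) for ref in references)
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Illustrative strings only; these are not molecules from the paper.
    known = ["CCO", "c1ccccc1", "CC(=O)O"]
    print(closest_reference("CC(=O)OC", known))
```

Collecting the score for every generated molecule gives a similarity distribution of the kind summarized in panels (b) and (c); lower values indicate molecules further from the known inhibitors under the chosen fingerprint.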
