Mol Inform. 2018 Jan;37(1-2):1700153.
doi: 10.1002/minf.201700153. Epub 2018 Jan 10.

De Novo Design of Bioactive Small Molecules by Artificial Intelligence

Daniel Merk et al. Mol Inform. 2018 Jan.

Abstract

Generative artificial intelligence offers a fresh view on molecular design. We present the first-time prospective application of a deep learning model for designing new druglike compounds with desired activities. For this purpose, we trained a recurrent neural network to capture the constitution of a large set of known bioactive compounds represented as SMILES strings. By transfer learning, this general model was fine-tuned on recognizing retinoid X and peroxisome proliferator-activated receptor agonists. We synthesized five top-ranking compounds designed by the generative model. Four of the compounds revealed nanomolar to low-micromolar receptor modulatory activity in cell-based assays. Apparently, the computational model intrinsically captured relevant chemical and biological knowledge without the need for explicit rules. The results of this study advocate generative artificial intelligence for prospective de novo molecular design, and demonstrate the potential of these methods for future medicinal chemistry.
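
To make the string-based setup concrete, the following minimal sketch shows how SMILES strings could be turned into next-character training pairs of the kind a recurrent network learns from. The start and end markers (`<`, `>`), the purely character-level tokenization (which would split two-letter atoms such as Cl or Br), and the toy molecules are assumptions made for illustration; they are not taken from the authors' preprocessing.

```python
# Illustrative sketch only: character-level next-token pairs from SMILES.
# START/END markers and the toy molecules are assumptions for this example.
START, END = "<", ">"


def build_vocab(smiles_list):
    """Map every character (plus the start/end markers) to an integer index."""
    chars = sorted(set("".join(smiles_list)) | {START, END})
    return {ch: i for i, ch in enumerate(chars)}


def next_char_pairs(smiles, vocab):
    """Return (input_index, target_index) pairs: each character predicts the next."""
    seq = START + smiles + END
    return [(vocab[a], vocab[b]) for a, b in zip(seq, seq[1:])]


# Hypothetical toy data: ethanol, benzene, acetic acid.
toy_smiles = ["CCO", "c1ccccc1", "CC(=O)O"]
vocab = build_vocab(toy_smiles)
pairs = next_char_pairs("CCO", vocab)
```

In a transfer-learning setting of the kind the abstract describes, the same encoding would be applied first to the large general set and then to the smaller set of receptor agonists, so that both training stages share one vocabulary.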

Keywords: Automation; drug discovery; machine learning; medicinal chemistry; nuclear receptor.

Conflict of interest statement

G. S. declares a potential financial conflict of interest in his role as life‐science industry consultant and cofounder of inSili.com GmbH, Zurich.

Figures

Figure 1
Concept of generative artificial intelligence (AI). A model of the training data (e.g., molecular structures) is obtained that can be used to emit new instances (new chemical entities) within the training domain by sampling.
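
The fit-then-sample idea in this caption can be illustrated with a deliberately simple stand-in. The sketch below uses a first-order character Markov chain, not the recurrent neural network of the study; it is far weaker and does not guarantee valid SMILES output. The toy training strings, the markers `<` and `>`, and the function names are all hypothetical.

```python
import random
from collections import Counter, defaultdict

# Assumed start/end markers; not part of the SMILES strings themselves.
START, END = "<", ">"


def fit_transitions(smiles_list):
    """Count how often each character follows another in the training strings."""
    counts = defaultdict(Counter)
    for smi in smiles_list:
        seq = START + smi + END
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts


def sample(counts, rng, max_len=50):
    """Emit one new string by sampling successors until END or max_len."""
    out, ch = [], START
    for _ in range(max_len):
        successors = counts[ch]
        ch = rng.choices(list(successors), weights=list(successors.values()))[0]
        if ch == END:
            break
        out.append(ch)
    return "".join(out)


# Hypothetical toy training data; a real model would see many thousands of molecules.
training = ["CCO", "CCN", "CC(=O)O", "c1ccccc1", "c1ccccc1O"]
model = fit_transitions(training)
rng = random.Random(0)
generated = [sample(model, rng) for _ in range(5)]
```

Every sampled string uses only characters and character transitions seen during training, which is the sense in which new instances stay "within the training domain".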
Figure 2
Chemical space analysis by multi‐dimensional scaling. Compounds were represented by Morgan substructure fingerprints (radius=0–4 bonds, length=1024 bits), and similarity was defined by the Jaccard‐Tanimoto index. Colored dots represent the training data (light grey), fine‐tuning set (green), known RXR (orange) and PPAR (blue) agonists, sampled molecules (dark grey), and the selected de novo designs 1–5 (red). Compounds 1, 2, 3 and 5 populate the same area as the known RXR and PPAR agonists, while 4 is similar to PPAR agonists but remote from known RXR actives.
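
As a rough illustration of the similarity measure named in this caption, the sketch below computes the Jaccard‐Tanimoto index between fingerprints stored as sets of on-bit positions, and derives the pairwise distance matrix (1 minus similarity) that a multi‐dimensional scaling routine would take as input. The fingerprints are invented bit sets, not real Morgan fingerprints, and returning 1.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(a, b):
    """Jaccard-Tanimoto index: shared on-bits divided by total distinct on-bits."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention when both fingerprints are empty
    return len(a & b) / len(union)


# Hypothetical fingerprints: indices of set bits in a 1024-bit vector.
fps = {
    "mol_A": {3, 17, 256, 900},
    "mol_B": {3, 17, 300, 900},
    "mol_C": {5, 42},
}

names = sorted(fps)
# Pairwise distances, suitable as input to a multi-dimensional scaling routine.
dist = [[1.0 - tanimoto(fps[p], fps[q]) for q in names] for p in names]
```

Here `mol_A` and `mol_B` share three of five distinct bits, giving a similarity of 0.6, while `mol_C` shares none with either and sits at the maximum distance of 1.0.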
Scheme 1
Synthesis of designs 1–5. Reagents & conditions: (a) H2N−C6H4−COOH (7), EDC, 4‐DMAP, THF, reflux, 4 h; (b) C6H5−B(OH)2 (9), Pd(PPh3)4, Cs2CO3, dioxane, 100 °C, 16 h; (c) KOH, MeOH/THF/H2O, μw, 70 °C, 30 min; (d) HO‐C6H3F−B(OH)2 (12), Pd(PPh3)4, Cs2CO3, toluene/EtOH, 100 °C, 20 h; (e) F‐C6H4‐CH2‐Br (15), K2CO3, DMF, μw, 100 °C, 120 min; (f) MeOH, H2SO4cc, reflux, 4 h; (g) C5H9Br (18), K2CO3, DMF, μw, 100 °C, 6 h; (h) HO‐C6H4‐B(OH)2 (20), Pd(PPh3)4, Cs2CO3, toluene/EtOH, 100 °C, 16 h; (i) C6H4Cl‐C6H4‐COOH (24), EDC, 4‐DMAP, CHCl3, reflux, 12 h; (j) C6H3Br(OH)2 (27), Pd(PPh3)4, Cs2CO3, dioxane/DMF, reflux, 4 h; (k) malonic acid, pyridine/piperidine, μw, 100 °C, 30 min.
