Chem Sci. 2021 Sep 9;12(41):13664-13675.
doi: 10.1039/d1sc04444c. eCollection 2021 Oct 27.

Structure-based de novo drug design using 3D deep generative models

Yibo Li et al. Chem Sci.

Abstract

Deep generative models are attracting much attention in the field of de novo molecule design. Compared to traditional methods, deep generative models can be trained in a fully data-driven way with little requirement for expert knowledge. Although many models have been developed to generate 1D and 2D molecular structures, 3D molecule generation is less explored, and the direct design of drug-like molecules inside target binding sites remains challenging. In this work, we introduce DeepLigBuilder, a novel deep learning-based method for de novo drug design that generates 3D molecular structures in the binding sites of target proteins. We first developed Ligand Neural Network (L-Net), a novel graph generative model for the end-to-end design of chemically and conformationally valid 3D molecules with high drug-likeness. Then, we combined L-Net with Monte Carlo tree search to perform structure-based de novo drug design tasks. In the case study of inhibitor design for the main protease of SARS-CoV-2, DeepLigBuilder suggested a list of drug-like compounds with novel chemical structures, high predicted affinity, and similar binding features to those of known inhibitors. The current version of L-Net was trained on drug-like compounds from ChEMBL; the training could be readily extended to other molecular datasets with properties chosen by the user, and the model applied to functional molecule generation. Merging deep generative models with atomic-level interaction evaluation, DeepLigBuilder provides a state-of-the-art model for structure-based de novo drug design and lead optimization.
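
The abstract describes pairing a generative policy with Monte Carlo tree search (MCTS). The following is a minimal, generic sketch of an MCTS loop (selection, expansion, simulation, backpropagation) on a toy problem. It is not the authors' implementation: the action set `ACTIONS`, the cap `MAX_LEN`, the reward `toy_score`, and the uniform random rollout are all hypothetical stand-ins, used here in place of L-Net's learned policy and a docking-style score.

```python
import math
import random

ACTIONS = ["C", "N", "O"]   # hypothetical edit actions (stand-ins for structure edits)
MAX_LEN = 6                 # hypothetical cap on the number of edits


def toy_score(state):
    """Hypothetical reward in [0, 1]; a stand-in for a docking-style score."""
    return sum(1 for a in state if a == "N") / MAX_LEN


class Node:
    def __init__(self, state, parent=None):
        self.state = state          # tuple of actions taken so far
        self.parent = parent
        self.children = {}          # action -> Node
        self.visits = 0
        self.total = 0.0            # sum of rollout rewards seen through this node

    def is_terminal(self):
        return len(self.state) >= MAX_LEN

    def fully_expanded(self):
        return len(self.children) == len(ACTIONS)


def uct_child(node, c=1.4):
    """Pick the child maximizing mean reward plus an exploration bonus (UCT)."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(state, rng):
    """Complete the state with uniformly random actions (stand-in for a learned policy)."""
    state = list(state)
    while len(state) < MAX_LEN:
        state.append(rng.choice(ACTIONS))
    return tuple(state)


def mcts(n_iter=500, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best_state, best_score = None, -1.0
    for _ in range(n_iter):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while not node.is_terminal() and node.fully_expanded():
            node = uct_child(node)
        # Expansion: add one untried action.
        if not node.is_terminal():
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # Simulation: finish the structure and score it.
        final = rollout(node.state, rng)
        reward = toy_score(final)
        if reward > best_score:
            best_state, best_score = final, reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_state, best_score
```

Calling `mcts()` returns the best completed action sequence found and its toy score. In a structure-based setting the rollout would presumably be driven by the generative model and the reward by a binding-affinity estimate, but those details are beyond what the abstract states.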

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. The architecture of DeepLigBuilder, which contains two components: a 3D molecule generative network called L-Net (a) and an optimization module based on MCTS (b). (a) The architecture of L-Net. L-Net generates 3D molecular structures by iteratively editing the structure. A state encoder is used to analyze the existing structure, and a policy network is used for decision-making. (b) A schematic diagram of L-Net combined with MCTS to optimize drug-like molecules inside the protein binding pocket.
Fig. 2. Quantitative evaluations of L-Net. (a–d) Performance of L-Net measured by (a) the percentage of valid outputs; (b) RMSD values after optimization; (c) 2D MMD; and (d) 2D precision (pink) and recall (blue). Rows indicate different hyperparameter selections. Comparison with G-SchNet is shown in green (temperature set to 1.0 during comparison). (e and f) Distribution of (e) QED and (f) molecular weight among generated (blue) and test set (grey) molecules. (g) Shape distribution of generated (left) and test set (right) molecules, visualized using NPR descriptors. (h) The MMD values of the torsion distribution for each torsion pattern, ranked from lowest to highest. (i) Torsion distributions with median-level MMD values, blue: generated molecules, grey: test set molecules.
Fig. 3. Lead optimization using DeepLigBuilder. (a) The topological structure of MI-23, a reported covalent inhibitor targeting SARS-CoV-2 Mpro. The blue part of the molecule is used as a seed structure by DeepLigBuilder for molecule generation. (b) The range of the scores for the 10 best molecules sampled at each step during the Monte Carlo tree search. Blue: MCTS turned on; grey: MCTS turned off. (c) The distribution of the smina score, QED, and SAscore of high-quality samples. (d) The distribution of hydrophobic groups in high-quality samples. (e) The distribution of hydrogen bond acceptors in high-quality samples. (f–h) Examples of molecules generated by DeepLigBuilder. The binding poses of the generated molecules (grey) are aligned with MI-23 (light grey) for comparison. The residue Glu166 is shown in blue. Molecular properties and predicted binding affinities are listed in the grey boxes.
Fig. 4. (a–c) The structure and binding pose of compound 5, a known noncovalent inhibitor of SARS-CoV-2 Mpro. (a) The binding pose of compound 5 inside Mpro. (b) The topological structure of compound 5; the part of the molecule colored in blue is used as a seed structure. (c) Key interaction between Mpro and compound 5. (d) The cumulative number of generated samples with smina score < −9 kcal mol⁻¹ as a function of the number of MCTS steps performed: blue line, the median value across successful runs; light blue areas, the 10–90th and 25–75th percentiles. (e) The distributions of scaffolds generated from the first run (red), the 2nd to 32nd runs (dark blue), and the 33rd to 64th runs (light blue). (f) The distribution of the smina score, QED, and SAscore of the molecules generated with DeepLigBuilder. (g) Several important structural features of compound 5. (h) The distribution of hydrophobic groups of the generated molecules. (i) The distribution of hydrogen bond acceptors of the generated molecules.
Fig. 5. (a) Examples of DeepLigBuilder generated scaffolds. (b) Analysis of three promising compounds generated. The first column: topological structures of generated molecules. The second column: molecular properties (QED, SAscore) and predicted binding affinity. The third column: interactions between the generated molecules and the protein. The last column: comparison between the locally minimized (dark grey) and globally minimized poses (light grey) of the generated molecules, as well as the binding pose of compound 5 (blue).
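
The Fig. 2 caption reports MMD (maximum mean discrepancy) values between generated and test set molecules. As a rough illustration of what such a number measures, the sketch below computes the biased estimate of squared MMD between two samples of scalar values. The Gaussian kernel, the bandwidth `sigma`, and the use of plain 1D values are assumptions made for this example; the captions do not state which kernel or features the authors used, and periodic quantities such as torsion angles would need a kernel that respects periodicity.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on scalars; sigma is an assumed bandwidth."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between two 1D samples."""
    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from a and b.
        return sum(gaussian_kernel(u, v, sigma) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

Identical samples give a value of zero, and the value grows as the two distributions move apart, which is why lower MMD is read as closer agreement between generated and reference molecules.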
