J Chem Inf Model. 2017 Apr 24;57(4):875-882.
doi: 10.1021/acs.jcim.6b00754. Epub 2017 Apr 3.

Chemical Space Mimicry for Drug Discovery

William Yuan et al. J Chem Inf Model. 2017.

Abstract

We describe a new library generation method, Machine-based Identification of Molecules Inside Characterized Space (MIMICS), that generates sets of molecules inspired by a text-based input. MIMICS-generated libraries were found to preserve distributions of properties while simultaneously increasing structural diversity. Newly identified MIMICS-generated compounds were found to be bioactive as inhibitors of specific components of the unfolded protein response (UPR) and the VEGFR2 pathway in cell-based assays, thus confirming the applicability of this methodology toward drug design applications. Wider application of MIMICS could facilitate the efficient utilization of chemical space.

Conflict of interest statement

Notes

The authors declare no competing financial interest.

Figures

Figure 1
Structural novelty comparison. (A) Bemis–Murcko clustering was conducted on the MIMICS and input molecule sets to assess the diversity and novelty of central structural motifs. The number of unique scaffolds produced as a function of MIMICS molecules generated is displayed. (B) The Tanimoto distance between a particular structure and its nearest neighbor in the input set was computed using the Open Babel FP2 fingerprint for samples of MIMICS and input molecules. (C) Nearest-neighbor distance histogram for MIMICS molecules and input molecules relative to the input. (D) Nearest-neighbor distance histogram for MIMICS molecules and input molecules relative to themselves.
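The two quantities plotted in Figure 1 can be illustrated with a minimal sketch. It assumes fingerprints are already available as sets of on-bit indices (the study used the Open Babel FP2 fingerprint; no cheminformatics library is called here) and that scaffolds have already been reduced to canonical strings. The function names `tanimoto_distance`, `nn_distances`, and `unique_scaffold_curve` are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Sequence, Set


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Return 1 - |a & b| / |a | b| for two fingerprints given as on-bit sets."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def nn_distances(
    queries: Sequence[Set[int]],
    reference: Sequence[Set[int]],
    skip_same_index: bool = False,
) -> List[float]:
    """Distance from each query to its nearest neighbor in the reference set.

    Set skip_same_index=True when a set is compared with itself, so that a
    molecule is not matched to itself at distance 0. Raises ValueError if no
    reference fingerprint remains for a query.
    """
    result = []
    for i, query in enumerate(queries):
        candidates = [
            tanimoto_distance(query, ref)
            for j, ref in enumerate(reference)
            if not (skip_same_index and i == j)
        ]
        result.append(min(candidates))
    return result


def unique_scaffold_curve(scaffolds: Iterable[str]) -> List[int]:
    """Cumulative number of distinct scaffolds after each generated molecule."""
    seen = set()
    curve = []
    for scaffold in scaffolds:
        seen.add(scaffold)
        curve.append(len(seen))
    return curve


# Toy data, invented for illustration only.
input_fps = [{1, 2, 3}, {3, 4, 5}]
generated_fps = [{1, 2, 3}, {1, 2, 4}, {7, 8}]
print(nn_distances(generated_fps, input_fps))  # [0.0, 0.5, 1.0]
print(unique_scaffold_curve(["c1ccccc1", "c1ccncc1", "c1ccccc1"]))  # [1, 2, 2]
```

A generated molecule at distance 0 from the input set duplicates an input fingerprint, while values near 1 indicate little shared substructure; histograms of these distances correspond to panels C and D.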
Figure 2
Comparison with input. MIMICS (blue) and input (orange) molecules are compared structurally and descriptively. (A) Normalized PMI ratio plots for each set of compounds were computed. The points labeled Median and Mode correspond to the median and mode coordinates of all points. Descriptive properties computed using PaDEL-descriptor include (B, C) numbers of hydrogen-bond acceptors and donors, (D) ring count, (E) rotatable bond count, (F) fraction of sp3-hybridized carbons, (G) XLogP, (H) topological polar surface area, and (I) molecular weight (MW). For all of the computed descriptors, both the average values and overall distributions were preserved in going from the input set to the generated MIMICS set.
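For readers unfamiliar with normalized PMI ratio plots (panel A), the following is a minimal sketch of how one point on such a plot can be computed from 3D coordinates. It assumes the common convention of plotting (I1/I3, I2/I3) for the sorted principal moments of inertia I1 ≤ I2 ≤ I3; the caption does not state how the authors computed these values, and the function names are hypothetical. The eigenvalues are obtained with the standard closed-form trigonometric method for symmetric 3×3 matrices so that no external library is needed.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def inertia_tensor(coords: Sequence[Vec3], masses: Sequence[float]) -> List[List[float]]:
    """Mass-weighted inertia tensor about the center of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        x, y, z = (c[k] - com[k] for k in range(3))
        t[0][0] += m * (y * y + z * z)
        t[1][1] += m * (x * x + z * z)
        t[2][2] += m * (x * x + y * y)
        t[0][1] -= m * x * y
        t[0][2] -= m * x * z
        t[1][2] -= m * y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t


def symmetric_eigenvalues_3x3(a: List[List[float]]) -> List[float]:
    """Eigenvalues of a real symmetric 3x3 matrix, in ascending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted([a[0][0], a[1][1], a[2][2]])
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (
        b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
        - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
        + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0])
    )
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest
    return [smallest, middle, largest]


def normalized_pmi_ratios(coords: Sequence[Vec3], masses: Sequence[float]) -> Tuple[float, float]:
    """Return (I1/I3, I2/I3) for the sorted principal moments I1 <= I2 <= I3."""
    i1, i2, i3 = symmetric_eigenvalues_3x3(inertia_tensor(coords, masses))
    if i3 <= 0.0:
        raise ValueError("largest principal moment is zero; need at least two distinct points")
    return i1 / i3, i2 / i3


# Idealized shapes with unit masses, invented for illustration only.
rod = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
disc = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
print(normalized_pmi_ratios(rod, [1.0] * 2))
print(normalized_pmi_ratios(disc, [1.0] * 4))
```

Under this convention, rod-like shapes fall near (0, 1), disc-like shapes near (0.5, 0.5), and sphere-like shapes near (1, 1), which is what gives the plot its triangular outline.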
Figure 3
(A) Neuron activations for four different neurons. Letter colors indicate neuron activation at those particular letters, with green corresponding to positive activation and red corresponding to negative activation. Activations of three neurons with well-defined behavior and one without (out of 1538 neurons total) are displayed. Neuron 678 has been recolored because of the low magnitude of raw activations. (B, C) Mapping of neuron activations from SMILES to the molecular structure for neurons 1285 (B) and 678 (C).
Figure 4
Confirmation of bioactivity against the IRE1α/XBP1 pathway, a branch of the UPR. The HT1080 (human fibrosarcoma) cell line was stably transduced with an XBP1-luciferase reporter construct. (top) Generated SMILES expressions, (middle) structures, and (bottom) dose–response curves showing inhibitory action relative to CMV control toward IRE1α/XBP1 are presented for the two identified inhibitors, (A) STF-021898 and (B) STF-046304.
Figure 5
Frequency distributions of binding energies (in kcal/mol) for MIMICS and an existing screening library, the Stanford High-Throughput Bioscience Center (HTBC) library. More negative values indicate more stable ligand–protein complexes and higher binding affinities.
Figure 6
Novel VEGFR-2 inhibitors inhibit HUVEC tube formation with improved potency compared with a known VEGFR-2 inhibitor and minimal nonspecific cytotoxicity to normal cells. (A) Five novel VEGFR-2 inhibitors were tested on HUVEC tube formation at a higher dose range of 0–20 μM. DMSO (solvent) was used as the control treatment. A known VEGFR-2 inhibitor, vatalanib, was used as a positive control and for potency comparison. (B) Two inhibitors that displayed the highest potencies in inhibiting tube formation at the higher dose range were tested at a lower dose range (1–1000 nM). (C, D) Bright-field images of the effects of the two most potent compounds on HUVEC tube formation at the lower dose range. (E) Normal human mammary epithelial cell line (MCF10A) was treated with the two most potent compounds at the lower dose range, and cell viability was assessed by trypan blue staining after 24 h. Data represent means of triplicate experiments. Error bars represent standard errors of the mean.
