Sci Rep. 2018 Nov 7;8(1):16469.
doi: 10.1038/s41598-018-34677-0.

Scaffold-Hopping from Synthetic Drugs by Holistic Molecular Representation

Francesca Grisoni et al. Sci Rep.

Abstract

The discovery of novel ligand chemotypes allows the exploration of uncharted regions of chemical space, thereby potentially improving the synthetic accessibility, potency, and drug-likeness of molecules. Here, we demonstrate the scaffold-hopping ability of the new Weighted Holistic Atom Localization and Entity Shape (WHALES) molecular descriptors compared to seven state-of-the-art molecular representations on 30,000 compounds and 182 biological targets. In a prospective application, we apply WHALES to the discovery of novel retinoid X receptor (RXR) modulators. WHALES descriptors identified four agonists with innovative molecular scaffolds, populating uncharted regions of the chemical space. One of the agonists, possessing a rare non-acidic chemotype, showed high selectivity over 12 nuclear receptors and efficacy comparable to that of bexarotene in inducing ATP-binding cassette transporter A1, angiopoietin-like protein 4 and apolipoprotein E. The outcome of this research supports WHALES as an innovative tool to explore novel regions of the chemical space and to detect novel bioactive chemotypes by straightforward similarity searching.

Conflict of interest statement

G.S. declares a potential financial conflict of interest in his role as life science industry consultant and cofounder of inSili.com GmbH, Zurich.

Figures

Figure 1
Simplified representation of WHALES calculation, taking the example of bexarotene. (a) Input chemical information for WHALES calculation, i.e., three-dimensional coordinates and partial charges. (b) Computed atom-centred interatomic distances for two pairs of atoms. The distances are normalized according to the atom-centred covariance (here depicted as an ellipsoid whose main axes are the directions of maximum variance), computed by considering the distribution of atoms and charges in the three-dimensional space (see Eq. 1). (c) Atom-centred covariance matrix (ACM), containing all the pairwise distances computed from each atomic centre (column) to each other atom (row). Only non-hydrogen atoms are considered. (d) Frequency distribution of remoteness (Rem) and isolation degree (Is) of the molecule, computed as row average and column minimum (diagonal elements excluded) of the ACM, respectively. Negatively charged atoms are assigned a negative sign of remoteness and isolation degree. (e) WHALES descriptors, computed as deciles (from d1 to d9, plus minimum and maximum) of remoteness, isolation degree and their ratio (IR), obtaining in total 33 molecular-size-independent descriptors (WHALES).
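Steps (d) and (e) of this caption can be illustrated with a minimal sketch that starts from an already computed ACM. It is not the authors' implementation. The function name `whales_from_acm`, the toy matrix, and the toy charges are hypothetical. Three choices are assumptions that the caption does not settle: the diagonal is excluded from the row average as well as from the column minimum, the ratio IR is taken as isolation degree divided by remoteness and then given the sign of the atomic charge, and the deciles use the inclusive linear interpolation of Python's `statistics.quantiles`.

```python
from statistics import quantiles


def whales_from_acm(acm, charges):
    """Return 33 values: min, d1..d9, max for Rem, Is and IR (in that order).

    acm[i][j] is the normalized distance from atomic centre j (column)
    to atom i (row); charges[i] is the partial charge of atom i.
    """
    n = len(acm)
    # Assumption: only strictly negative charges flip the sign.
    signs = [-1.0 if q < 0 else 1.0 for q in charges]

    rem, iso, ratio = [], [], []
    for a in range(n):
        row = [acm[a][j] for j in range(n) if j != a]  # diagonal excluded (assumption)
        col = [acm[i][a] for i in range(n) if i != a]  # diagonal excluded (per caption)
        r = sum(row) / len(row)  # remoteness: row average
        s = min(col)             # isolation degree: column minimum
        rem.append(signs[a] * r)
        iso.append(signs[a] * s)
        ratio.append(signs[a] * s / r)  # IR, signed by charge (assumption)

    descriptors = []
    for values in (rem, iso, ratio):
        descriptors.append(min(values))
        # Nine cut points d1..d9; the interpolation method is an assumption.
        descriptors.extend(quantiles(values, n=10, method="inclusive"))
        descriptors.append(max(values))
    return descriptors


# Toy three-atom example with made-up numbers, for illustration only.
toy_acm = [
    [0.0, 1.0, 2.0],
    [3.0, 0.0, 4.0],
    [5.0, 6.0, 0.0],
]
toy_charges = [-0.3, 0.1, 0.2]
descriptors = whales_from_acm(toy_acm, toy_charges)
print(len(descriptors))
```

Because every molecule is reduced to the same 11 summary values per property, the output length does not depend on the number of atoms, which is what makes the descriptors molecular-size-independent.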
Figure 2
Retrospective virtual screening on known bioactives. 30,000 ChEMBL bioactive compounds (IC/EC50, Kd, Ki values < 1 μM) on 182 biological targets were used for virtual screening with three versions of WHALES (GM, DFTB+, shape) and seven state-of-the-art molecular descriptors. (a) Relative scaffold diversity of actives for each descriptor on each dataset, expressed as the ratio of differing scaffolds to the number of retrieved actives among the top 5% portion of the respective screening runs. Boxplots show the median (line), mean (white dot), standard deviation (box edges), 5th and 95th percentiles (whiskers); grey dots represent outliers; asterisks denote the minimum value. WHALES descriptors produced a significantly higher relative scaffold diversity of actives (p < 0.01, Kruskal-Wallis with Dunn’s post-hoc analysis), except for WHALES-GM and WHALES-DFTB+ compared to WHIM (p = 1.00). (b) Principal Component Analysis (PCA) performed on the SDA% values obtained by each descriptor on each biological target (first two PCs depicted, E.V. = explained variance). B and W denote the highest and lowest value produced by the pool of descriptors on each biological receptor; the dashed line represents the variation from the worst to the best relative scaffold diversity on average. Descriptors (circles) are coloured according to their mean SDA%, from white (low) to blue (high). WHALES descriptors (dashed circle) have the largest SDA% on average. (c) Comparison between the enrichment factor (EF1%) of WHALES-GM and WHALES-DFTB+. Blue dots represent the cases where the SDA% of WHALES-GM in the top 1% of the list was more than 3% larger than that of WHALES-DFTB+. In no case was the SDA% of WHALES-DFTB+ more than 3% larger than that of WHALES-GM. (d) Comparison between the enrichment factor (EF1%) of WHALES-GM and WHALES-shape. Blue dots represent the cases where the SDA% of WHALES-GM in the top 1% of the list was more than 3% larger than that of WHALES-shape; the opposite case is represented by orange asterisks; grey circles denote biological targets with similar SDA%. Molecular targets for which WHALES performed well in terms of enrichment are highlighted in (c) and (d) with the following labels: BDK = bradykinin receptor, BR = bombesin receptor, DNAgyr = DNA gyrase, NEU = neuraminidase, RXR = retinoid X receptor, STK = serine/threonine protein kinase (PIKK family).
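The two metrics named in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The relative scaffold diversity of actives follows the wording in panel (a): distinct scaffolds divided by retrieved actives in the top portion of the ranked list, expressed here as a percentage. The enrichment factor uses the commonly used definition (hit rate in the top fraction divided by the hit rate in the whole library), which the caption itself does not spell out. The function names, the dictionary keys `active` and `scaffold`, and the toy ranking are hypothetical.

```python
def top_fraction(ranked, fraction):
    """Return the best-ranked portion of the list (at least one compound)."""
    k = max(1, int(round(len(ranked) * fraction)))
    return ranked[:k]


def relative_scaffold_diversity(ranked, fraction=0.05):
    """Distinct scaffolds per retrieved active in the top fraction, in percent."""
    actives = [c for c in top_fraction(ranked, fraction) if c["active"]]
    if not actives:
        return 0.0
    return 100.0 * len({c["scaffold"] for c in actives}) / len(actives)


def enrichment_factor(ranked, fraction=0.01):
    """Hit rate in the top fraction divided by the hit rate of the whole list."""
    total_actives = sum(1 for c in ranked if c["active"])
    if total_actives == 0:
        return 0.0
    top = top_fraction(ranked, fraction)
    hit_rate_top = sum(1 for c in top if c["active"]) / len(top)
    hit_rate_all = total_actives / len(ranked)
    return hit_rate_top / hit_rate_all


# Toy ranked list of 100 compounds (made-up data, best similarity first).
ranked = [
    {"active": True, "scaffold": "A"},
    {"active": True, "scaffold": "A"},
    {"active": True, "scaffold": "B"},
    {"active": False, "scaffold": "X"},
    {"active": True, "scaffold": "C"},
]
ranked += [{"active": False, "scaffold": "X"} for _ in range(95)]
ranked[50] = {"active": True, "scaffold": "D"}

sda_top5 = relative_scaffold_diversity(ranked, fraction=0.05)
ef_top1 = enrichment_factor(ranked, fraction=0.01)
print(sda_top5, ef_top1)
```

In the toy list, the top 5% contains four actives spread over three scaffolds, so the diversity is 75%. The single top-ranked compound is active while 5% of the whole list is active, which gives an enrichment factor of 20.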
Figure 3
Queries used for the WHALES-GM-based virtual screening of commercially available compounds. (a) Query structures, labelled according to the scaffold type (from 1 to 4), with Murcko scaffolds highlighted. (b) Reduced scaffolds of the queries, labelled with Roman numerals (from i to iv). The reduced scaffolds i, ii and iii characterize 22%, 13% and 3% of the RXR actives annotated in ChEMBL23 (EC50/IC50 < 50 μM), respectively.
Figure 4
Analysis of the hits obtained with WHALES-GM on RXR receptors. (a) Scaffolds of the active hits identified by WHALES-GM (5–8, bold, cf. Table 1). None of these scaffolds was present in the ChEMBL23 annotated modulators. (b) Fragment analysis of hits and queries compared with known ChEMBL agonists (EC50 < 50 μM) and inactives (EC50, IC50, Ki, Kd > 50 μM) on RXR. Multi-dimensional scaling (MDS) was performed on the extended connectivity fingerprints (1024-bit, radius = 0 to 3 bonds, 2 bits per pattern). Colours represent the set considered (grey = active and inactive compounds from ChEMBL, blue = queries, orange = WHALES hits); active hits are labelled with their ID (cf. Table 1). (c) Lead-likeness of ChEMBL agonists, queries and active hits evaluated according to octanol-water partitioning coefficient (SlogP), solubility (AlogS), molecular weight (MW) and number of rotatable bonds (nRB).
Figure 5
In silico and in vitro analysis of hit 7. (a) The approved RXR agonist drug bexarotene (9), which was used as the reference for the analysis. (b) Comparison between the predicted binding poses of 7 (orange) and bexarotene (blue) in the ligand binding site of RXRα. The crystal structure of RXRα in complex with the agonist 9cUAB30 and the coactivator peptide GRIP-1 (PDB-ID: 4K4J) was prepared in MOE (v2016.0802), following the default protein preparation protocol. The structure energy was minimized using the Amber10:EHT force field. For each ligand (i.e., crystallized ligand, bexarotene and hit 7), 60 poses were generated, their energy was minimized using the MMFF94x force field within a rigid receptor, and they were ranked by London dG score; the top 10 poses were refined and scored using GBVI/WSA dG and the top-scoring pose was chosen. 7 and bexarotene share a similar binding pose, with 7 missing the interaction with R316 due to its lack of an acidic feature. (c) Control experiment: in the absence of a Gal4-RXR hybrid receptor, the Gal4-responsive reporter gene was not transactivated by 7, confirming RXR-mediated activity. (d) RXR ligand 7 is highly selective over twelve related nuclear receptors (peroxisome proliferator-activated receptor [PPARα/γ/δ], liver X receptor [LXRα/β], farnesoid X receptor [FXR], retinoic acid receptor [RARα/β/γ], vitamin D receptor [VDR], pregnane X receptor [PXR], constitutive androstane receptor [CAR]). (e) RXR modulator 7 induces the RXR-regulated genes ATP-binding cassette transporter A1 (ABCA1), angiopoietin-like protein 4 (ANGPTL4) and apolipoprotein E (ApoE) with an efficacy comparable to that of the RXR agonist bexarotene.
