J Chem Inf Model. 2011 Sep 26;51(9):2174-85.
doi: 10.1021/ci2001428. Epub 2011 Aug 31.

Scaffold diversity of exemplified medicinal chemistry space

Free PMC article

Sarah R Langdon et al. J Chem Inf Model. 2011.

Abstract

The scaffold diversity of 7 representative commercial and proprietary compound libraries is explored for the first time using both Murcko frameworks and Scaffold Trees. We show that Level 1 of the Scaffold Tree is useful for the characterization of scaffold diversity in compound libraries and offers advantages over the use of Murcko frameworks. This analysis also demonstrates that the majority of compounds in the libraries we analyzed contain only a small number of well represented scaffolds and that a high percentage of singleton scaffolds represent the remaining compounds. We use Tree Maps to clearly visualize the scaffold space of representative compound libraries, for example, to display highly populated scaffolds and clusters of structurally similar scaffolds. This study further highlights the need for diversification of compound libraries used in hit discovery by focusing library enrichment on the synthesis of compounds with novel or underrepresented scaffolds.
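The distribution described in the abstract (a few well-populated scaffolds plus a long tail of singletons) can be summarized with a cumulative scaffold frequency curve. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes each compound has already been assigned a scaffold identifier (for example a Murcko framework or Level 1 scaffold string produced by a cheminformatics toolkit). The function names `scaffold_frequency_summary` and `scaffolds_to_cover`, and the toy labels, are hypothetical.

```python
from collections import Counter


def scaffold_frequency_summary(scaffold_per_compound):
    """Summarize how compounds are distributed over scaffolds.

    `scaffold_per_compound` is a list with one scaffold identifier per
    compound (assumed to be precomputed; no chemistry is done here).
    """
    if not scaffold_per_compound:
        raise ValueError("at least one compound is required")

    counts = Counter(scaffold_per_compound)
    total = sum(counts.values())

    # Rank scaffolds from most to least populated.
    ranked = sorted(counts.values(), reverse=True)

    # Cumulative fraction of compounds covered by the top-k scaffolds.
    cumulative = []
    running = 0
    for count in ranked:
        running += count
        cumulative.append(running / total)

    # A singleton scaffold is one that occurs in exactly one compound.
    singletons = sum(1 for count in ranked if count == 1)

    return {
        "n_compounds": total,
        "n_scaffolds": len(ranked),
        "singleton_fraction": singletons / len(ranked),
        "cumulative_fraction": cumulative,
    }


def scaffolds_to_cover(cumulative, fraction):
    """Return how many top-ranked scaffolds cover `fraction` of compounds."""
    for rank, value in enumerate(cumulative, start=1):
        if value >= fraction:
            return rank
    return len(cumulative)


# Toy data set (hypothetical labels): one dominant scaffold, two singletons.
example = ["A"] * 6 + ["B"] * 2 + ["C", "D"]
summary = scaffold_frequency_summary(example)
print(summary)
print(scaffolds_to_cover(summary["cumulative_fraction"], 0.5))
```

Plotting `cumulative_fraction` against the scaffold rank gives a curve of the kind shown in the cumulative scaffold frequency figures. A steep initial rise followed by a long flat tail indicates a library dominated by a few scaffolds.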

Figures

Figure 1
An interpretation of the Markush structure as described in the 1924 Markush patent.(3)
Figure 2
The HSP90 inhibitor NVP-AUY922 depicted using different scaffold representations.
Figure 3
Scaffold Tree analysis: cumulative scaffold frequency plot showing the distribution of compounds over Level 1 scaffolds in the ICRSC, VC, ICRFL, CHEMBL, DBSM, DBAD, and BIOFOC data sets.
Figure 4
Murcko framework analysis: cumulative scaffold frequency plot showing the distribution of compounds over Murcko framework scaffolds in the ICRSC, VC, ICRFL, CHEMBL, DBSM, DBAD, and BIOFOC data sets.
Figure 5
Examples of how compounds of different complexity are represented by Murcko frameworks and Level 1 scaffolds. Each molecule has n+1 levels, numbered sequentially from Level 0 (the single remaining ring) up to Level n (the whole molecule), where Level n-1 is the Murcko framework. Compound a: a typical leadlike/druglike chemical structure; compound b: a typical fragmentlike chemical structure.
Figure 6
Example Tree Map. The colored circles represent scaffolds and are labeled with their scaffold frequency. The area and color of the circles relate to the scaffold frequency. Scaffold circles are grouped into gray circles if the scaffolds are in the same cluster.
Figure 7
Tree Map of the VC data set Level 1 scaffolds. Scaffolds are represented by colored circles whose area and color relate to the scaffold frequency; gray circles represent clusters of scaffolds. Tree Maps illustrate the large proportion of singleton scaffolds in the data sets (many small white circles) and the presence of highly populated scaffolds (few large green circles).
Figure 8
Tree Map of the DBSM data set Level 1 scaffolds. Scaffolds are represented by colored circles whose area and color relate to the scaffold frequency; gray circles represent clusters of scaffolds. Tree Maps illustrate the large proportion of singleton scaffolds in the data sets (many small white circles) and the presence of highly populated scaffolds (few large green circles).
Figure 9
Tree Map of the BIOFOC data set Level 1 scaffolds. Scaffolds are represented by colored circles whose area and color relate to the scaffold frequency; gray circles represent clusters of scaffolds. Tree Maps illustrate the large proportion of singleton scaffolds in the data sets (many small white circles) and the presence of highly populated scaffolds (few large green circles).

References

    1. Villar H. O.; Hansen M. R. Design of chemical libraries for screening. Expert Opin. Drug Discovery 2009, 4, 1215–1220.
    2. Akritopoulou-Zane I.; Hajduk P. J. Kinase-targeted libraries: The design and synthesis of novel, potent, and selective kinase inhibitors. Drug Discovery Today 2009, 14, 291–297.
    3. Markush E. A. Pyrazolone dye and process of making the same, Pharma Chemical Corp, Patent Number: 1506316, United States. 1924. The patent describes the Markush structure as “The yellow coloring matter which may be obtained by coupling to halogen-substitution products of pyrazolone, a diazotized unsulphonated material selected from the group consisting of aniline, homologues of aniline and halogen substitution products of aniline”.
    4. Leach A. R.; Gillet V. J. An Introduction to Chemoinformatics, Revised ed.; Springer: Dordrecht, 2007.
    5. Brough P. A.; Aherne A.; Barril X.; Borgognoni J.; Boxall K.; Cansfield J. E.; Cheung K.-M. J.; Collins I.; Davies N. G. M.; Drysdale M. J.; Dymock B.; Eccles S. A.; Finich H.; Fink A.; Hayes A.; Howes R.; Hubbard R. E.; James K.; Jordan A. M.; Lockie A.; Martins V.; Massey A.; Matthews T. P.; McDonald E.; Northfield C. J.; Pearl L. H.; Prodromou C.; Ray S.; Raynaud F. I.; Roughley S. D.; Sharp S. Y.; Surgenor A.; Walmsley D. L.; Webb P.; Wood M.; Workman P.; Wright L. 4,5-Diarylisoxazole Hsp90 Chaperone Inhibitors: Potential Therapeutic Agents for the Treatment of Cancer. J. Med. Chem. 2008, 51, 196–218.