Proc Natl Acad Sci U S A. 2010 Nov 2;107(44):18787-92.
doi: 10.1073/pnas.1012741107. Epub 2010 Oct 18.

Small molecules of different origins have distinct distributions of structural complexity that correlate with protein-binding profiles

Paul A Clemons et al. Proc Natl Acad Sci U S A. 2010.

Abstract

Using a diverse collection of small molecules generated from a variety of sources, we measured protein-binding activities of each individual compound against each of 100 diverse (sequence-unrelated) proteins using small-molecule microarrays. We also analyzed structural features, including complexity, of the small molecules. We found that compounds from different sources (commercial, academic, natural) have different protein-binding behaviors and that these behaviors correlate with general trends in stereochemical and shape descriptors for these compound collections. Increasing the content of sp3-hybridized and stereogenic atoms relative to compounds from commercial sources, which comprise the majority of current screening collections, improved binding selectivity and frequency. The results suggest structural features that synthetic chemists can target when synthesizing screening collections for biological discovery. Because binding proteins selectively can be a key feature of high-value probes and drugs, synthesizing compounds having features identified in this study may result in improved performance of screening collections.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Study design to relate structural complexity to protein-binding profiles. Three sources of compounds were studied; diverse samples of each subset are shown to illustrate differences between the subsets (all structures in the study are presented in Dataset S1 and Dataset S2).
Fig. 2.
Stereochemical complexity of compounds from three sources. Complexity is expressed as the proportion of all carbon atoms that are stereogenic carbon atoms (Cstereogenic/Ctotal).
Fig. 3.
Shape complexity of compounds from three sources. Complexity is expressed as the proportion of sp2- or sp3-hybridized carbon atoms that are sp3-hybridized (Csp3/[Csp2 + Csp3]); (A) whole molecules, (B) “scaffold” atoms in the molecular framework only.
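The two complexity ratios defined in the captions for Fig. 2 and Fig. 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the atom counts in the usage lines are hand-counted for two familiar molecules rather than taken from the study's datasets. In practice the counts would come from a cheminformatics toolkit.

```python
def stereochemical_complexity(n_stereogenic_c: int, n_total_c: int) -> float:
    """Fraction of carbon atoms that are stereogenic: Cstereogenic/Ctotal."""
    if n_total_c <= 0:
        raise ValueError("molecule must contain at least one carbon atom")
    return n_stereogenic_c / n_total_c


def shape_complexity(n_sp3_c: int, n_sp2_c: int) -> float:
    """Fraction of sp2- or sp3-hybridized carbons that are sp3: Csp3/(Csp2 + Csp3)."""
    denominator = n_sp2_c + n_sp3_c
    if denominator <= 0:
        raise ValueError("molecule must contain at least one sp2 or sp3 carbon")
    return n_sp3_c / denominator


# Hand-counted illustrative inputs (assumptions, not data from the paper):
# cholesterol has 27 carbons, 8 stereogenic carbons, 25 sp3 and 2 sp2 carbons;
# benzene has 6 carbons, all sp2, none stereogenic.
cholesterol_stereo = stereochemical_complexity(8, 27)
cholesterol_shape = shape_complexity(25, 2)
benzene_stereo = stereochemical_complexity(0, 6)
benzene_shape = shape_complexity(0, 6)

print(f"cholesterol: stereo={cholesterol_stereo:.3f}, shape={cholesterol_shape:.3f}")
print(f"benzene:     stereo={benzene_stereo:.3f}, shape={benzene_shape:.3f}")
```

Both ratios range from 0 to 1. A flat aromatic compound such as benzene scores 0 on both, while a steroid-like natural product scores high on both. For panel (B) of Fig. 3, the same function would be applied to counts restricted to scaffold atoms.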
Fig. 4.
Binary hit calls for compounds from three sources against 100 proteins. Heatmap depicts presence (white) or absence (black) of a hit call for all compounds scoring as hits against at least one protein. Colored bars indicate source of compounds: CC (red), NP (green), DC (blue).
Fig. 5.
Hit-rate analysis of compounds from three sources in 100 protein-binding assays. Box-whisker plots depict second and third quartiles (blue boxes) above and below median values (red lines) with adjacent values indicating maximum nonoutlier values (black whiskers) and outlier values (red crosses); see legend at right. These data show the greatest hit rates for DC and lowest for NP (see text).
Fig. 6.
Analysis of binding promiscuity among hit compounds. Three promiscuity categories were evaluated for compounds scored as hits against at least one protein in 100 protein-binding assays: binding to 1–5 proteins (white), 6–24 proteins (gray), 25+ proteins (black); numbers of compounds with significant enrichment (**) or depletion (*), relative to the overall proportion (far left), are indicated. These data show (A) CC members are most likely to bind 6+ proteins, NP members least likely, and DC members intermediate; (B) reevaluation after removing spiroxindole-based compounds derived from one synthetic pathway in DC to create a reduced DC set (see text) suggests that much promiscuous binding among DC can be attributed to a single class of compound.
Fig. 7.
Analysis of binding specificity among hit compounds. Three specificity categories were evaluated for compounds scored as hits against at least one protein in 100 protein-binding assays: binding to exactly one protein (white), 2–5 proteins (gray), 6+ proteins (black); numbers of compounds indicating significant enrichment (**) or depletion (*), relative to the overall proportion (far left), are indicated. These data show both DC and NP members are most likely to bind exactly one protein, while significant fractions of CC members bind at least two proteins (see text).
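The category boundaries in the captions for Fig. 6 and Fig. 7 amount to binning each compound by the number of proteins it binds. The sketch below shows that binning on a binary hit-call matrix of the kind described for Fig. 4. It is an illustration only: the function names, the compound labels, and the toy hit rows are all made up, and the enrichment and depletion significance tests mentioned in the captions are not reproduced.

```python
from collections import Counter


def proteins_bound(hit_row):
    """Count proteins with a positive hit call in one compound's binary row."""
    return sum(1 for call in hit_row if call)


def promiscuity_category(n_bound):
    """Fig. 6 bins: 1-5, 6-24, or 25+ proteins (0 is a non-hit)."""
    if n_bound == 0:
        return "non-hit"
    if n_bound <= 5:
        return "1-5"
    if n_bound <= 24:
        return "6-24"
    return "25+"


def specificity_category(n_bound):
    """Fig. 7 bins: exactly 1, 2-5, or 6+ proteins (0 is a non-hit)."""
    if n_bound == 0:
        return "non-hit"
    if n_bound == 1:
        return "exactly 1"
    if n_bound <= 5:
        return "2-5"
    return "6+"


# Toy hit calls against 100 proteins (made-up data, not from the study).
hit_matrix = {
    "cpd_a": [1] + [0] * 99,       # binds 1 protein
    "cpd_b": [1] * 3 + [0] * 97,   # binds 3 proteins
    "cpd_c": [1] * 30 + [0] * 70,  # binds 30 proteins
    "cpd_d": [0] * 100,            # binds none
}

counts = {name: proteins_bound(row) for name, row in hit_matrix.items()}
promiscuity = Counter(promiscuity_category(n) for n in counts.values())
specificity = Counter(specificity_category(n) for n in counts.values())

print(dict(promiscuity))
print(dict(specificity))
```

The two schemes differ only in where the cut points fall: the promiscuity bins separate moderately from highly promiscuous binders, while the specificity bins isolate compounds that bind exactly one protein.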
Fig. 8.
Connection between binding specificity and stereochemical complexity. Four specificity categories (including non-hits) were evaluated: binding to 0 proteins (light gray), exactly 1 protein (white), 2–5 proteins (gray), 6+ proteins (black); numbers of compounds indicating significant enrichment (**) or depletion (*) relative to proportional representation are indicated. These data show stereochemically simple compounds most likely bind multiple proteins, intermediate complexity compounds most likely bind exactly 1 protein, and the most complex compounds most likely bind 0 proteins (see text).
