Bioinformatics. 2008 Nov 1;24(21):2518-25.
doi: 10.1093/bioinformatics/btn479. Epub 2008 Sep 10.

Chemical substructures that enrich for biological activity

Justin Klekota et al. Bioinformatics. 2008.

Abstract

Motivation: Certain chemical substructures are present in many drugs. This has led to the claim of 'privileged' substructures which are predisposed to bioactivity. Because bias in screening library construction could explain this phenomenon, the existence of privilege has been controversial.

Results: Using diverse phenotypic assays, we defined bioactivity for multiple compound libraries. Many substructures were associated with bioactivity even after accounting for substructure prevalence in the library, thus validating the privileged substructure concept. Determinations of privilege were confirmed in independent assays and libraries. Our analysis also revealed 'underprivileged' substructures and 'conditional privilege': rules relating combinations of substructures to bioactivity. Most previously reported substructures have been flat aromatic ring systems. Although we validated such substructures, we also identified three-dimensional privileged substructures. Most privileged substructures display a wide variety of substituents, suggesting an entropic mechanism of privilege. Compounds containing privileged substructures had a doubled rate of bioactivity, suggesting practical consequences for pharmaceutical discovery.
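One way to picture "accounting for substructure prevalence" is to compare the active rate among compounds that contain a substructure with the active rate of the whole library, and then ask how surprising that excess is given how many compounds carry the substructure. The Python sketch below is only an illustration under stated assumptions: the abstract does not specify the statistical procedure the authors used, the hypergeometric test is a stand-in chosen here, and the function names and counts are hypothetical.

```python
from math import comb

def enrichment_ratio(n_library, n_active, n_with_sub, n_active_with_sub):
    """Active rate among substructure-containing compounds divided by the library-wide active rate."""
    base_rate = n_active / n_library
    sub_rate = n_active_with_sub / n_with_sub
    return sub_rate / base_rate

def hypergeom_upper_tail(n_library, n_active, n_with_sub, n_active_with_sub):
    """P(at least n_active_with_sub actives among n_with_sub compounds drawn at random).

    Assumption: a one-sided hypergeometric test, used here only as a
    plausible way to correct for prevalence; it is not taken from the paper.
    """
    total = comb(n_library, n_with_sub)
    upper = min(n_active, n_with_sub)
    tail = sum(
        comb(n_active, k) * comb(n_library - n_active, n_with_sub - k)
        for k in range(n_active_with_sub, upper + 1)
    )
    return tail / total

# Hypothetical counts: 1000 compounds, 184 active, 50 contain the
# substructure, and 20 of those 50 are active.
ratio = enrichment_ratio(1000, 184, 50, 20)
p_value = hypergeom_upper_tail(1000, 184, 50, 20)
print(f"enrichment ratio: {ratio:.2f}, one-sided p-value: {p_value:.3g}")
```

A ratio near 2 with a small p-value would correspond to the kind of "doubled rate of bioactivity" the abstract describes, while a rare substructure with the same ratio would yield a much weaker p-value because fewer compounds support it.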

Figures

Fig. 1.
Reportedly privileged substructures and related compounds. (A) Reportedly privileged substructures. 1,4-benzodiazepin-2-one (1), indole (2), purine (3), spiropiperidine (4) and biphenyl (5) are reported as privileged. The wavy line on spiropiperidine (4) indicates that the ring can have variable composition and size. (B) Examples of endogenous molecules containing reportedly privileged substructures. ATP (6) contains purine, the amino acid tryptophan (7) contains indole, and nordiazepam (8), which occurs naturally in mammalian brains, contains 1,4-benzodiazepin-2-one. (C) Examples of drugs with reportedly privileged substructures. Devazepide (9) is a cholecystokinin A antagonist that contains indole and 1,4-benzodiazepin-2-one. Acyclovir (10), used to treat herpes, contains purine. Spiperone (11), a dopamine D2 antagonist, contains spiropiperidine. Diflunisal (12), an anti-inflammatory analgesic, contains biphenyl.
Fig. 2.
Discriminating substructures identified by the decision tree. The substructures were selected by the decision tree to discriminate active and inactive compounds in the Chembridge library. The ‘X’ symbol indicates a non-hydrogen atom, and hydrogen atoms (whether shown or implied) in ‘X’-containing substructures must be matched. All other substructures have unspecified patterns of hydrogen and non-hydrogen atom substitution. The symbols ‘+’ and ‘-’ indicate whether the node substructure is associated with an increase or a decrease, respectively, in compound activity relative to its parent node in the tree. Bold arrows pointing away from a substructure indicate its presence and dotted arrows indicate its absence. The substructure composition of each leaf (blue circle or red diamond) is constrained by the intersection of statements about the presence or absence of substructures traced from the tree root (node 1) to each leaf. The nodes containing the substructures are numbered, and the fraction of active compounds is listed in each node and leaf. Leaves shown as blue circles are enriched in activity and leaves shown as red diamonds are depleted in activity relative to the entire library (18.4% of the library is active, as indicated by the tree root, node 1). For space considerations, a subtree stemming from node 25 has been excluded (indicated by an enclosing box; see Supplementary Fig. S1 for this subtree). Supplementary Table S1 details the prevalence of selected substructures within the library as well as their enrichment in bioactivity when considered individually (without respect to the presence of any other substructure).
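The way a leaf is reached in such a tree can be sketched in a few lines of Python: at each node, follow the "present" branch if the compound contains that node's substructure and the "absent" branch otherwise, then compare the leaf's active fraction with the library-wide rate. This is a toy illustration, not the authors' tree; the `Node` class, the substructure names `sub_A` and `sub_B`, and every fraction except the 18.4% root rate quoted in the caption are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class Node:
    active_fraction: float               # fraction of active compounds at this node
    substructure: Optional[str] = None   # None marks a leaf
    present: Optional["Node"] = None     # branch taken when the substructure is present
    absent: Optional["Node"] = None      # branch taken when the substructure is absent

def find_leaf(root: Node, substructures: Set[str]) -> Node:
    """Walk from the root to a leaf using presence/absence of each node's substructure."""
    node = root
    while node.substructure is not None:
        node = node.present if node.substructure in substructures else node.absent
    return node

def leaf_label(leaf: Node, library_rate: float) -> str:
    """Label a leaf relative to the library-wide active rate."""
    return "enriched" if leaf.active_fraction > library_rate else "depleted"

# Hypothetical two-level tree; only the 0.184 root rate comes from the caption.
toy_tree = Node(
    active_fraction=0.184,
    substructure="sub_A",
    present=Node(
        active_fraction=0.40,
        substructure="sub_B",
        present=Node(active_fraction=0.55),
        absent=Node(active_fraction=0.30),
    ),
    absent=Node(active_fraction=0.15),
)

leaf = find_leaf(toy_tree, {"sub_A", "sub_B"})
print(leaf.active_fraction, leaf_label(leaf, toy_tree.active_fraction))
```

Because each leaf is defined by the whole path of presence and absence statements, the same substructure can sit above both enriched and depleted leaves, which is one way to read the abstract's notion of conditional privilege.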
Fig. 3.
Validated discriminating substructures and related compounds. (A) Structures of quinoline, quinoxaline and quinazoline. Quinoline (13) (present at node 43) is associated with an increase in activity and is similar to the reportedly privileged substructures quinoxaline (14) and quinazoline (15). (B) Structures of vitamin K1, vitamin K2, naphthoquinone, 1,3-indandione and phenindione. Naphthoquinone (18) and 1,3-indandione (19), which are similar to vitamin K1 (16) and vitamin K2 (17), were identified as enriched in activity by the decision tree. Phenindione (20), an FDA-approved anticoagulant containing 1,3-indandione, is also shown. (C) Structures of NSC636679, NSC634791 and NSC618757. NSC636679 (21), NSC634791 (22) and NSC618757 (23) contain the substructure at node 7, which the tree identified as enriched in activity; these compounds inhibit cancer cell growth by inhibiting the ABCB1 (MDR1) membrane transport protein. (D) Structure of NIH. NIH (24) is a metal chelator containing the most discriminating substructure (node 1, root of the tree).
Fig. 4.
Enrichment of compounds for bioactivity based on substructure composition. (A) Recovery of active compounds in the 10% Chembridge Diverse Set E Test Set. A decision tree built using 90% of the Chembridge Diverse Set E library was used to rank the remaining 10% of compounds by the rate of bioactivity expected given their substructure composition. The number of actives retrieved by the decision tree (wide black line) is 2–5 times greater than that retrieved by random selection, which is indicated by the gray shaded area showing the mean ± 1 SD. (B) Recovery of compounds that inhibit cancer cell growth in the NCI Library. The original decision tree was used to rank NCI compounds according to the rate of activity against cancer cell growth expected given their substructure composition. The number of actives retrieved by the decision tree rankings (wide black line) is 1.5–3 times greater than that retrieved by random selection, which is indicated by the gray shaded area showing the mean ± 1 SD.
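The recovery comparison described in this caption can be sketched as follows: rank compounds by their predicted rate of bioactivity, count the actives among the top-ranked compounds, and compare that count with what random selection of the same number of compounds would give. The Python below is a minimal sketch under assumptions: the caption does not say how the random mean ± 1 SD band was computed, so the closed-form hypergeometric mean and standard deviation are used here as a stand-in, and the function names and toy data are hypothetical.

```python
from math import sqrt

def actives_recovered(scores, is_active, top_k):
    """Number of actives among the top_k compounds when ranked by descending score."""
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    return sum(1 for _, active in ranked[:top_k] if active)

def random_baseline(n_total, n_active, top_k):
    """Mean and SD of actives found when top_k compounds are picked at random.

    Assumption: sampling without replacement, so the count follows a
    hypergeometric distribution.
    """
    p = n_active / n_total
    mean = top_k * p
    variance = top_k * p * (1 - p) * (n_total - top_k) / (n_total - 1)
    return mean, sqrt(variance)

# Hypothetical test set of 10 compounds, 3 of which are active.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]
is_active = [True, True, False, True, False, False, False, False, False, False]

found = actives_recovered(scores, is_active, top_k=3)
mean, sd = random_baseline(len(scores), sum(is_active), top_k=3)
print(f"recovered {found} actives; random selection: {mean:.2f} +/- {sd:.2f}")
```

Dividing the recovered count by the random mean at each cutoff gives a fold-enrichment figure comparable in kind to the "2–5 times" and "1.5–3 times" ranges quoted in the caption.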
