Compound library development guided by protein structure similarity clustering and natural product structure

Marcus A Koch et al. Proc Natl Acad Sci U S A. 2004 Nov 30;101(48):16721-6.
doi: 10.1073/pnas.0404719101. Epub 2004 Nov 17.

Abstract

To identify biologically relevant and drug-like protein ligands for medicinal chemistry and chemical biology research, the grouping of proteins according to evolutionary relationships and conservation of molecular recognition is an established method. We propose to employ structure similarity clustering of the ligand-sensing cores of protein domains (PSSC) in conjunction with natural product-guided compound library development as a synergistic approach for the identification of biologically prevalidated ligands with high fidelity. This is supported by the concepts that (i) in nature, spatial structure is more conserved than amino acid sequence, (ii) the number of fold types characteristic for all protein domains is limited, and (iii) the underlying frameworks of natural product classes with multiple biological activities provide evolutionarily selected starting points in structural space. On the basis of domain core similarity considerations and irrespective of sequence similarity, Cdc25A phosphatase, acetylcholinesterase, and 11β-hydroxysteroid dehydrogenases type 1 and type 2 were grouped into a similarity cluster. A 147-member compound collection derived from the naturally occurring Cdc25A inhibitor dysidiolide yielded potent and selective inhibitors of the other members of the similarity cluster with a hit rate of 2-3%. Protein structure similarity clustering may provide an experimental opportunity to identify supersites in proteins.

Figures

Fig. 1.
Database search strategy. Dali/Fold Classification Based on Structure–Structure Alignment of Proteins (FSSP) searches or Combinatorial Extension (CE) searches are performed by using the Protein Data Bank (PDB) code of the protein of interest. Alternatively, coordinates of a query protein structure may be submitted, and Dali/FSSP or CE compares them against those in the PDB. The FSSP and CE databases are based on exhaustive all-against-all 3D structure comparison of protein structures currently included in the PDB. Hits are listed in order of decreasing similarity (3D and sequence similarity). From this list, proteins belonging to pharmaceutically relevant families/superfamilies with low sequence identity (SI up to 20%) are chosen and visually inspected. The part of the protein that is relevant to the delineated concept, i.e., the catalytic core (the conserved part of the domain where the active site is located), must be structurally similar and superimposed. RMSD, rms deviation for aligned Cα positions. When protein structures become too large, most superimposition algorithms might fail, so smaller subsets of such big domains containing the catalytic core of interest have to be superimposed. Here, according to our experience, the CE algorithm (see Supporting Materials and Methods) delivers the best results.
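The selection step in this caption (keep hits with sequence identity up to 20%, ranked by structural similarity) and the RMSD over aligned Cα positions can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not call Dali, FSSP, or CE. The `Hit` record, its field names, the use of a Z-score as the ranking key, and the example values are all assumptions made for illustration. The `rmsd` function assumes the two coordinate sets have already been superimposed and paired residue by residue.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of a structure-search hit list (hypothetical layout)."""
    label: str               # placeholder identifier, not a real PDB code
    z_score: float           # assumed similarity score; higher means more similar
    seq_identity_pct: float  # sequence identity over the aligned region, in percent


def select_low_identity_hits(hits, max_identity_pct=20.0):
    """Keep hits at or below the identity cutoff, most similar structure first."""
    kept = [h for h in hits if h.seq_identity_pct <= max_identity_pct]
    return sorted(kept, key=lambda h: h.z_score, reverse=True)


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of already superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Made-up example data, not taken from the paper.
hits = [
    Hit("HIT_A", z_score=12.0, seq_identity_pct=35.0),  # too similar in sequence
    Hit("HIT_B", z_score=6.5, seq_identity_pct=8.0),
    Hit("HIT_C", z_score=9.1, seq_identity_pct=20.0),   # exactly at the cutoff
]
selected = select_low_identity_hits(hits)
print([h.label for h in selected])

# Two toy Cα traces offset by 2 Å along z.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(trace_a, trace_b))
```

The visual inspection of the catalytic core described in the caption is a manual step and is not modelled here.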
Fig. 2.
Synthesis of 2- and 3-substituted furans and γ-hydroxybutenolides (A) and 5-substituted butenolides and bisbutenolides (B). LDA, lithium diisopropylamide; THF, tetrahydrofuran; IBX, 1-hydroxy-1,2-benziodoxol-3(1H)-one 1-oxide; r.t., room temperature.
Fig. 3.
Superimposed catalytic cores of AChE (blue) and Cdc25A (red). The best matching parts were aligned with an rms deviation of 2.74 Å at an alignment length of 49 residues. The sequence identity in this best matching part amounts to 8.2%. Also shown, in Corey–Pauling–Koltun (CPK) representation, are the catalytic residues, Ser-200 (AChE) and Cys-430 (Cdc25A). They share the same location.
Fig. 4.
Superimposed catalytic cores of Cdc25A (red), 11βHSD1 (dark green), and 11βHSD2 (light green). Cdc25A and the pharmaceutically relevant 11βHSD1 exhibit an rms deviation of 4.13 Å at an alignment length of 80 residues. The sequence identity in this part amounts to 5.0%. Also shown, in CPK representation, are the catalytic residues, Cys-430 (Cdc25A), Tyr-183 (11βHSD1), and Tyr-232 (11βHSD2). They occupy the same space, although they derive from different locations when Cdc25A and the 11βHSDs are compared.
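The sequence identities quoted for the superimposed cores (8.2% over 49 aligned residues, 5.0% over 80) can be put in context with a short sketch of how percent identity over an alignment is commonly computed. The function name, the choice of alignment length as the denominator, and the toy sequences are assumptions. The count of four identical positions is inferred from the arithmetic and is not stated in the captions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue; gap columns never match."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("need two non-empty aligned sequences of equal length")
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * same / len(aligned_a)


# Toy aligned fragments (made up): 2 of 4 columns match.
toy_identity = percent_identity("ACDE", "ACFG")
print(toy_identity)

# Arithmetic check against the quoted values, assuming 4 identical positions.
ache_cdc25a = round(100.0 * 4 / 49, 1)   # alignment length 49
cdc25a_hsd1 = round(100.0 * 4 / 80, 1)   # alignment length 80
print(ache_cdc25a, cdc25a_hsd1)
```

Under these assumptions both quoted percentages correspond to only a handful of identical residues, which illustrates how little sequence signal accompanies the structural match.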
Fig. 5.
Top view of the catalytic sites of Cdc25A (red), 11βHSD1 (green), and AChE (blue). The key catalytic residues, Cys-430 (Cdc25A), Tyr-183 (11βHSD1), and Ser-200 (AChE), shown in CPK representation, are located similarly.
