Proc Natl Acad Sci U S A. 2012 Jan 24;109(4):1170-5.
doi: 10.1073/pnas.1119684109. Epub 2012 Jan 11.

Classification of protein functional surfaces using structural characteristics

Yan Yuan Tseng et al. Proc Natl Acad Sci U S A. 2012.

Abstract

Protein structure and function are closely related, especially in functional surfaces, which are local spatial regions that perform the biological functions. Also, protein structures tend to evolve more slowly than amino acid sequences. We have therefore developed a method to classify proteins using the structures of functional surfaces; we call it protein surface classification (PSC). PSC may reflect functional relationships among proteins and may detect evolutionary relationships among highly divergent sequences. We focused on the surfaces of ligand-bound regions because they represent well-defined structures. Specifically, we used structural attributes to measure similarities between binding surfaces and constructed a PSC library of ~2,000 binding surface types from the bound forms. Using flavin mononucleotide-binding proteins and glycosidases as examples, we show how the evolutionary position of an uncharacterized protein can be defined and its function inferred from the characterized members of the same surface subtype. We found that proteins with the same enzyme nomenclature may be divided into subtypes and that two proteins in the same CATH (Class, Architecture, Topology, Homologous superfamily) fold may belong to two different surface types. In conclusion, our approach complements the sequence-based and fold-domain classifications and has the advantage of associating the shape of a protein with its biological function. As an expandable library, PSC provides a resource of spatial patterns for studying the evolution of protein structure and function.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Oxidoreductase classification and functional inference of the oxidoreductases with no Enzyme Commission annotation. The evolutionary position of an uncharacterized protein is potentially identified when its functional surface matches a known surface subtype.
Fig. 2.
Topology of the four major surface subtypes of glycosidase, containing 48, 36, 15, and 44 members, respectively. The representatives of subtypes A, B, C, and D are PDB1u33 (EC 3.2.1.1), PDB1ukt (EC 2.4.1.19), PDB2d2o (EC 3.2.1.135), and PDB1kxh (EC 3.2.1.1), respectively. Each member of a subtype is associated with an EC label, if available. Subtype D has a variety of members with mixed EC labels, whereas subtypes A, B, and C consist of members with EC annotations.
Fig. 3.
Structural conservation and divergence of binding surfaces of glycosidase. Binding surface g is the center of the surface type, which contains 143 members that are clustered by surface attributes into A, B, C, and D subtypes. Subtype A (18 aa) has a solvent-accessible area of 269.53 Å² and a molecular volume of 401.27 Å³; subtypes B (26 aa), C (20 aa), and D (17 aa) have, respectively, solvent-accessible areas of 476.98, 261.61, and 270.00 Å² and molecular volumes of 867.84, 475.16, and 410.24 Å³. Among them, subtype D can be further subclassified.
Fig. 4.
Prediction of the binding surfaces of unbound structures using the binding surfaces in the PSC database. (A) The surface of PDB2fgz is partitioned into 74 putative local surfaces; only three putative pockets are shown. (B) The predicted surface (colored pink) is identified by an fPOP match to the binding surface of PDB2e8z with an rmsd of 1.95 Å. (C) The alignment of two pocket sequences has a sequence identity of 48.3%, much higher than the full-length sequence identity of 29.5%. Among the 29 aligned spatial pocket residues, there are 13 highly conserved residues. In particular, the six active sites of H340, R404, D406, E435, W437, and D525 on the template of PDB2e8z perfectly match those of the query of PDB2fgz. (D) Three of the 10 putative local surfaces are shown for PDB3qnm. (E) As a query, the predicted binding surface (colored pink) with 36 spatial pocket residues of PDB3qnm is matched with the binding surface of PDB3i76 (P ≤ 10⁻⁷). (F) Eleven of the 31 aligned pocket residues are highly conserved, with a Tanimoto coefficient of 0.93.
Fig. 5.
FMN-binding surfaces across protein superfamilies with different folds. (A) The binding surface of PDB1al7 (350 aa) of Spinacia oleracea has key residues (colored violet): Y24, Y129, D157, H254, and R257. These are located on a typical oxidoreductase (EC 1.1.3.15) fold of Aldolase class I (CATH 3.20.20.70). (B) The identified transferase-binding surface on PDB2vbv (134 aa) of Methanococcus jannaschii is a riboflavin kinase (EC 2.7.1.161) with a CATH fold of 2.40.30.30. (C) Thermus DNA lyase (EC 4.1.99.3) has a complicated fold pattern: CATH 3.40.50.620, 1.25.40.80, and 1.10.579.10. These folds contain unique key residues (colored violet) on the local surface of PDB2j09 of Thermus thermophilus: W257, W328, and W351. (D) The binding surface on PDB2zru (356 aa) of Sulfolobus shibatae has both isomerase (EC 5.3.3.2) and oxidoreductase activities with the same fold as in A.
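
The Fig. 4 caption reports two similarity measures for matched pockets: a percent sequence identity over aligned pocket residues and a Tanimoto coefficient. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. The function names, the gap character "-", and the choice of residue-label sets as Tanimoto features are all assumptions made for illustration; the caption does not say which features the reported coefficient uses.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned (gap-free) positions whose residues are identical.

    Hypothetical helper: inputs are two equal-length aligned pocket
    sequences in one-letter code, with '-' assumed to mark a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def tanimoto(features_a: set, features_b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two feature sets: |A & B| / |A | B|."""
    union = features_a | features_b
    if not union:
        return 1.0  # convention chosen here for two empty sets
    return len(features_a & features_b) / len(union)


# Toy data only; these are not the pockets from the paper.
identity = percent_identity("HRDEW-D", "HRDQWAD")
overlap = tanimoto({"H340", "R404", "D406"}, {"H340", "R404", "E435"})
print(f"identity = {identity:.1f}%, tanimoto = {overlap:.2f}")
```

As a purely arithmetic check, 14 identical positions out of 29 aligned residues would give 48.3%, which matches the identity quoted in panel C; the caption itself does not state the number of identical residues.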
