Identification of recurring protein structure microenvironments and discovery of novel functional sites around CYS residues
- PMID: 20122268
- PMCID: PMC2833161
- DOI: 10.1186/1472-6807-10-4
Abstract
Background: The emergence of structural genomics presents significant challenges in the annotation of biologically uncharacterized proteins. Unfortunately, our ability to analyze these proteins is restricted by the limited catalog of known molecular functions and their associated 3D motifs.
Results: To identify novel 3D motifs that may be associated with molecular functions, we employ an unsupervised, two-phase clustering approach that combines k-means and hierarchical clustering with knowledge-informed cluster selection and annotation methods. We applied the approach to approximately 20,000 cysteine-based protein microenvironments (3D regions 7.5 Å in radius) and identified 70 interesting clusters, some of which represent known motifs (e.g. metal binding and phosphatase activity) and some of which are novel, including several zinc binding sites. Detailed annotation results for all 70 clusters are available online at http://feature.stanford.edu/clustering/cys.
Conclusions: The use of microenvironments instead of backbone geometric criteria enables flexible exploration of protein function space, and detection of recurring motifs that are discontinuous in sequence and diverse in structure. Clustering microenvironments may thus help to functionally characterize novel proteins and better understand the protein structure-function relationship.
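The abstract names the two clustering phases but gives no implementation details. The following is only a minimal sketch of how a k-means pass followed by hierarchical merging could be chained. It assumes that each microenvironment is already encoded as a numeric feature vector, that distances are Euclidean, and that the second phase uses average linkage over the k-means centroids with a fixed distance cutoff. All function names, parameters, and the cutoff are hypothetical, and the knowledge-informed cluster selection and annotation step is not shown.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Phase 1 (assumed form): plain k-means on feature vectors (tuples)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # final assignment against the final centroids
    return centroids, [nearest(p) for p in points]


def agglomerate(centroids, cutoff):
    """Phase 2 (assumed form): average-linkage merging of centroids.

    Merging stops once the closest pair of groups is farther apart than `cutoff`.
    Returns a list of groups, each a list of centroid indices.
    """
    groups = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        total = sum(math.dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > 1:
        d, i, j = min(
            (linkage(groups[i], groups[j]), i, j)
            for i in range(len(groups))
            for j in range(i + 1, len(groups))
        )
        if d > cutoff:
            break
        merged = groups.pop(j)  # j > i, so index i is unaffected
        groups[i].extend(merged)
    return groups


def two_phase(points, k, cutoff, seed=0):
    """Chain both phases and return one final group id per input point."""
    centroids, labels = kmeans(points, k, seed=seed)
    groups = agglomerate(centroids, cutoff)
    group_of = {c: g for g, members in enumerate(groups) for c in members}
    return [group_of[lab] for lab in labels]
```

Over-partitioning with k-means first keeps the quadratic-cost hierarchical step small, since it runs on k centroids rather than on every microenvironment; whether the authors chose the two-phase design for this reason is not stated in the abstract.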