Defining the fold space of membrane proteins: the CAMPS database
- PMID: 16802318
- DOI: 10.1002/prot.21081
Abstract
Recent progress in structure determination techniques has led to a significant growth in the number of known membrane protein structures, and the first structural genomics projects focusing on membrane proteins have been initiated, warranting an investigation of appropriate bioinformatics strategies for optimal structural target selection for these molecules. What determines a membrane protein fold? How many membrane structures need to be solved to provide sufficient structural coverage of the membrane protein sequence space? We present the CAMPS database (Computational Analysis of the Membrane Protein Space) containing almost 45,000 proteins with three or more predicted transmembrane helices (TMH) from 120 bacterial species. This large set of membrane proteins was subjected to single-linkage clustering using only sequence alignments covering at least 40% of the TMH present in a given family. This process yielded 266 sequence clusters with at least 15 members, roughly corresponding to membrane structural folds, sufficiently structurally homogeneous in terms of the variation of TMH number between individual sequences. These clusters were further subdivided into functionally homogeneous subclusters according to the COG (Clusters of Orthologous Groups) system as well as more stringently defined families sharing at least 30% identity. The CAMPS sequence clusters are thus designed to reflect three main levels of interest for structural genomics: fold, function, and modeling distance. We present a library of Hidden Markov Models (HMM) derived from sequence alignments of TMH at these three levels of sequence similarity. Given that 24 out of 266 clusters corresponding to membrane folds already have associated known structures, we estimate that 242 additional new structures, one for each remaining cluster, would provide structural coverage at the fold level of roughly 70% of prokaryotic membrane proteins belonging to the currently most populated families.
© 2006 Wiley-Liss, Inc.
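The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes pairwise alignments have already been computed and that each one carries a single TMH coverage fraction; the function names, the tuple layout, and the toy data are hypothetical and are not taken from the CAMPS pipeline. Single-linkage clustering is implemented here as connected components over the accepted links, using a union-find structure.

```python
from collections import defaultdict


def filter_links(alignments, min_tmh_coverage=0.40):
    """Keep only pairs whose alignment covers enough of the TMH.

    `alignments` is an iterable of (protein_a, protein_b, tmh_coverage)
    tuples; this layout is an assumption made for illustration.
    """
    return [(a, b) for a, b, cov in alignments if cov >= min_tmh_coverage]


def single_linkage_clusters(n_items, links, min_size=15):
    """Single-linkage clustering as connected components of the link graph."""
    parent = list(range(n_items))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # One accepted link is enough to merge two clusters (single linkage).
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for item in range(n_items):
        groups[find(item)].append(item)

    # Discard clusters below the size cutoff and return a stable ordering.
    kept = [sorted(g) for g in groups.values() if len(g) >= min_size]
    return sorted(kept, key=lambda g: g[0])


# Toy data: 8 proteins, a lowered size cutoff so the example stays small.
toy_alignments = [
    (0, 1, 0.90), (1, 2, 0.50), (2, 3, 0.20),
    (3, 4, 0.80), (4, 5, 0.45), (6, 7, 0.95),
]
toy_clusters = single_linkage_clusters(8, filter_links(toy_alignments), min_size=3)
print(toy_clusters)
```

In the toy data the link between proteins 2 and 3 falls below the 40% coverage threshold, so the two groups stay separate, and the pair (6, 7) is dropped by the size cutoff. The abstract's cutoff of 15 members is the default for `min_size`.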