A unifold, mesofold, and superfold model of protein fold use
- PMID: 11746703
- DOI: 10.1002/prot.10011
Abstract
As more and more protein structures are determined, there is increasing interest in the question of how many different folds have been used in biology. The history of the rate of discovery of new folds and the distribution of sequence families among known folds provide a means of estimating the underlying distribution of fold use. Previous models exploiting these data have led to rather different conclusions on the total number of folds. We present a new model, based on the notion that the folds used in biology fall naturally into three classes: unifolds, that is, folds found only in a single narrow sequence family; mesofolds, found in an intermediate number of families; and the previously noted superfolds, found in many protein families. We show that this model fits the available data well and has predicted the development of SCOP over the past 2 years. The principal implications of the model are as follows: (1) the vast majority of folds will be found in only a single sequence family; (2) the total number of folds is at least 10,000; and (3) 80% of sequence families have one of about 400 folds, most of which are already known.
Copyright 2001 Wiley-Liss, Inc.
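The three-class idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model or fitting procedure: the numeric cut-offs (`UNIFOLD_MAX`, `MESOFOLD_MAX`), the function names, and the toy fold-to-family counts are all hypothetical, since the abstract gives no thresholds or data. The sketch only shows how folds might be binned by the number of sequence families that use them, and how a statement like "80% of sequence families have one of about 400 folds" corresponds to a cumulative coverage over the most-used folds.

```python
from collections import Counter

# Hypothetical cut-offs: the abstract defines the classes only qualitatively.
UNIFOLD_MAX = 1    # a unifold is found in a single sequence family
MESOFOLD_MAX = 10  # illustrative upper bound for "an intermediate number"


def classify_fold(n_families: int) -> str:
    """Bin a fold by how many sequence families adopt it."""
    if n_families < 1:
        raise ValueError("a fold must occur in at least one family")
    if n_families <= UNIFOLD_MAX:
        return "unifold"
    if n_families <= MESOFOLD_MAX:
        return "mesofold"
    return "superfold"


def class_counts(families_per_fold: dict) -> Counter:
    """Count how many folds fall into each of the three classes."""
    return Counter(classify_fold(n) for n in families_per_fold.values())


def family_coverage(families_per_fold: dict, top_n: int) -> float:
    """Fraction of all families that use one of the top_n most-used folds."""
    counts = sorted(families_per_fold.values(), reverse=True)
    return sum(counts[:top_n]) / sum(counts)


# Toy data, invented for illustration only (fold name -> number of families).
toy = {"fold_A": 40, "fold_B": 6, "fold_C": 3,
       "fold_D": 1, "fold_E": 1, "fold_F": 1}

print(class_counts(toy))
print(family_coverage(toy, top_n=3))
```

In the toy data, half of the folds are unifolds, yet the three most-used folds account for most families. That is the qualitative pattern the abstract describes, shown here with made-up numbers.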
Similar articles
- The number of protein folds and their distribution over families in nature. Proteins. 2004 Feb 15;54(3):491-9. doi: 10.1002/prot.10514. PMID: 14747997
- Estimating the total number of protein folds. Proteins. 1999 Jun 1;35(4):408-14. PMID: 10382668
- The Structural Rule Distinguishing a Superfold: A Case Study of Ferredoxin Fold and the Reverse Ferredoxin Fold. Molecules. 2022 May 31;27(11):3547. doi: 10.3390/molecules27113547. PMID: 35684484. Free PMC article.
- Protein folds, functions and evolution. J Mol Biol. 1999 Oct 22;293(2):333-42. doi: 10.1006/jmbi.1999.3054. PMID: 10529349. Review.
- The family feud: do proteins with similar structures fold via the same pathway? Curr Opin Struct Biol. 2005 Feb;15(1):42-9. doi: 10.1016/j.sbi.2005.01.011. PMID: 15718132. Review.
Cited by
- The Biological Big Bang model for the major transitions in evolution. Biol Direct. 2007 Aug 20;2:21. doi: 10.1186/1745-6150-2-21. PMID: 17708768. Free PMC article.
- New biochemistry in the Rhodanese-phosphatase superfamily: emerging roles in diverse metabolic processes, nucleic acid modifications, and biological conflicts. NAR Genom Bioinform. 2023 Mar 23;5(1):lqad029. doi: 10.1093/nargab/lqad029. eCollection 2023 Mar. PMID: 36968430. Free PMC article.
- Comparative genomics of ethanolamine utilization. J Bacteriol. 2009 Dec;191(23):7157-64. doi: 10.1128/JB.00838-09. Epub 2009 Sep 25. PMID: 19783625. Free PMC article.
- Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds. BMC Struct Biol. 2006 Mar 20;6:6. doi: 10.1186/1472-6807-6-6. PMID: 16549009. Free PMC article.
- CATHEDRAL: a fast and effective algorithm to predict folds and domain boundaries from multidomain protein structures. PLoS Comput Biol. 2007 Nov;3(11):e232. doi: 10.1371/journal.pcbi.0030232. PMID: 18052539. Free PMC article.