Evolutionary information for specifying a protein fold
- PMID: 16177782
- DOI: 10.1038/nature03991
Abstract
Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.
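As an illustration only, the idea of extracting coevolution between residue positions from a multiple sequence alignment can be sketched as below. The abstract does not specify the form of the statistical energy function, so this sketch substitutes plain mutual information between alignment columns as a generic coevolution measure. The function names (`mutual_information`, `coevolution_matrix`) and the toy alignment are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    Assumption: MI is used here as a stand-in coevolution score; it is not
    claimed to be the energy function used in the paper.
    """
    n = len(msa)
    fi = Counter(column(msa, i))                           # single-site counts at i
    fj = Counter(column(msa, j))                           # single-site counts at j
    fij = Counter(zip(column(msa, i), column(msa, j)))     # joint counts at (i, j)
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi


def coevolution_matrix(msa):
    """Pairwise coevolution scores for every pair of alignment positions."""
    length = len(msa[0])
    return {(i, j): mutual_information(msa, i, j)
            for i, j in combinations(range(length), 2)}


# Hypothetical toy alignment: positions 0 and 1 covary perfectly (A with K,
# G with R), position 2 is fully conserved, position 3 varies independently.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]

if __name__ == "__main__":
    for pair, score in sorted(coevolution_matrix(toy_msa).items()):
        print(pair, round(score, 3))
```

In this toy alignment only the covarying pair (0, 1) receives a nonzero score; conserved or independently varying positions score zero. A real analysis would need many more sequences and a correction for phylogenetic and sampling bias.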
Comment in
- Structural biology: form and function instructions. Nature. 2005 Sep 22;437(7058):486-7. doi: 10.1038/437486a. PMID: 16177774. No abstract available.
Similar articles
- Natural-like function in artificial WW domains. Nature. 2005 Sep 22;437(7058):579-83. doi: 10.1038/nature03990. PMID: 16177795.
- Thermodynamic propensities of amino acids in the native state ensemble: implications for fold recognition. Protein Sci. 2001 May;10(5):1032-45. doi: 10.1110/ps.01601. PMID: 11316884. Free PMC article.
- Capturing protein sequence-structure specificity using computational sequence design. Proteins. 2013 Sep;81(9):1556-70. doi: 10.1002/prot.24307. Epub 2013 Jun 20. PMID: 23609941.
- Fold recognition methods. Methods Biochem Anal. 2003;44:525-46. doi: 10.1002/0471721204.ch26. PMID: 12647403. Review.
- Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol. 2018 Feb;53(1):1-28. doi: 10.1080/10409238.2017.1380596. Epub 2017 Oct 4. PMID: 28976219. Free PMC article. Review.
Cited by
- Generative power of a protein language model trained on multiple sequence alignments. Elife. 2023 Feb 3;12:e79854. doi: 10.7554/eLife.79854. PMID: 36734516. Free PMC article.
- Chromosomal periodicity of evolutionarily conserved gene pairs. Proc Natl Acad Sci U S A. 2007 Jun 19;104(25):10559-64. doi: 10.1073/pnas.0610776104. Epub 2007 Jun 11. PMID: 17563360. Free PMC article.
- Influence of hPin1 WW N-terminal domain boundaries on function, protein stability, and folding. Protein Sci. 2007 Jul;16(7):1495-501. doi: 10.1110/ps.072775507. PMID: 17586778. Free PMC article.
- Maximum entropy models for antibody diversity. Proc Natl Acad Sci U S A. 2010 Mar 23;107(12):5405-10. doi: 10.1073/pnas.1001705107. Epub 2010 Mar 8. PMID: 20212159. Free PMC article.
- Engineering gain-of-function mutants of a WW domain by dynamics and structural analysis. Protein Sci. 2023 Sep;32(9):e4759. doi: 10.1002/pro.4759. PMID: 37574787. Free PMC article.