The protein structure prediction problem could be solved using the current PDB library
- PMID: 15653774
- PMCID: PMC545829
- DOI: 10.1073/pnas.0407152101
Abstract
For single-domain proteins, we examine the completeness of the structures in the current Protein Data Bank (PDB) library for use in full-length model construction of unknown sequences. To address this issue, we employ a comprehensive benchmark set of 1,489 medium-size proteins that cover the PDB at the level of 35% sequence identity and identify templates by structure alignment. With homologous proteins excluded, we can always find folds similar to native, with an average root-mean-square deviation (RMSD) from native of 2.5 Å and approximately 82% alignment coverage. These template structures often contain a significant number of insertions/deletions. The TASSER algorithm was applied to build full-length models, where continuous fragments are excised from the top-scoring templates and reassembled under the guidance of an optimized force field, which includes consensus restraints taken from the templates and knowledge-based statistical potentials. For almost all targets (except for 2/1,489), the resultant full-length models have an RMSD to native below 6 Å (97% of them below 4 Å). On average, the RMSD of full-length models is 2.25 Å, with aligned regions improved from 2.5 Å to 1.88 Å, comparable with the accuracy of low-resolution experimental structures. Furthermore, starting from state-of-the-art structural alignments, we demonstrate a methodology that can consistently bring template-based alignments closer to native. These results strongly suggest that the protein-folding problem can in principle be solved based on the current PDB library by developing efficient fold recognition algorithms that can recover such initial alignments.
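The two quantities the abstract reports for templates, RMSD over the aligned region and alignment coverage, can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and that each residue is represented by a single (x, y, z) point such as its Cα atom; the abstract does not describe how the authors compute these values, and the function names `rmsd` and `aligned_rmsd_and_coverage` are hypothetical, introduced only for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the points are already superimposed; no optimal
    rotation or translation is computed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def aligned_rmsd_and_coverage(native, template, alignment):
    """Return (RMSD over aligned residues, fraction of native residues aligned).

    `alignment` is a list of (native_index, template_index) pairs,
    a hypothetical representation of a structure alignment.
    """
    native_pts = [native[i] for i, _ in alignment]
    template_pts = [template[j] for _, j in alignment]
    coverage = len(alignment) / len(native)
    return rmsd(native_pts, template_pts), coverage


# Toy example: a 4-residue "native" and a 3-residue "template"
# displaced by 1 unit along z, aligned on the first three residues.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
alignment = [(0, 0), (1, 1), (2, 2)]

value, coverage = aligned_rmsd_and_coverage(native, template, alignment)
print(f"RMSD = {value:.2f}, coverage = {coverage:.0%}")
```

In practice, RMSD to native is usually reported after an optimal rigid-body superposition; that step is omitted here to keep the sketch dependency-free.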
Similar articles
- Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm. Proteins. 2004 Aug 15;56(3):502-18. doi: 10.1002/prot.20106. PMID: 15229883
- Tertiary structure predictions on a comprehensive benchmark of medium to large size proteins. Biophys J. 2004 Oct;87(4):2647-55. doi: 10.1529/biophysj.104.045385. PMID: 15454459 Free PMC article.
- Automated structure prediction of weakly homologous proteins on a genomic scale. Proc Natl Acad Sci U S A. 2004 May 18;101(20):7594-9. doi: 10.1073/pnas.0305695101. Epub 2004 May 4. PMID: 15126668 Free PMC article.
- A guide to template based structure prediction. Curr Protein Pept Sci. 2009 Jun;10(3):270-85. doi: 10.2174/138920309788452182. PMID: 19519455 Review.
- Progress and challenges in protein structure prediction. Curr Opin Struct Biol. 2008 Jun;18(3):342-8. doi: 10.1016/j.sbi.2008.02.004. Epub 2008 Apr 22. PMID: 18436442 Free PMC article. Review.
Cited by
- Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10. Proteins. 2014 Feb;82 Suppl 2(0 2):175-87. doi: 10.1002/prot.24341. Epub 2013 Aug 31. PMID: 23760925 Free PMC article.
- Deep-learning contact-map guided protein structure prediction in CASP13. Proteins. 2019 Dec;87(12):1149-1164. doi: 10.1002/prot.25792. Epub 2019 Aug 14. PMID: 31365149 Free PMC article.
- TASSER_low-zsc: an approach to improve structure prediction using low z-score-ranked templates. Proteins. 2010 Oct;78(13):2769-80. doi: 10.1002/prot.22791. PMID: 20635423 Free PMC article.
- Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis. BMC Struct Biol. 2009 Apr 17;9:23. doi: 10.1186/1472-6807-9-23. PMID: 19374763 Free PMC article.
- Characterization of mutant serine palmitoyltransferase 1 in LY-B cells. Lipids. 2009 Aug;44(8):725-32. doi: 10.1007/s11745-009-3316-4. Epub 2009 Jun 18. PMID: 19536577 Free PMC article.