Increased detection of structural templates using alignments of designed sequences
- PMID: 12696050
- DOI: 10.1002/prot.10346
Abstract
Protein structure prediction by comparative modeling benefits greatly from the use of multiple sequence alignment information to improve the accuracy of structural template identification and the alignment of target sequences to structural templates. Unfortunately, this benefit is limited to those protein sequences for which at least several natural sequence homologues exist. We show here that the use of large diverse alignments of computationally designed protein sequences confers many of the same benefits as natural sequences in identifying structural templates for comparative modeling targets. A large-scale massively parallelized application of an all-atom protein design algorithm, including a simple model of peptide backbone flexibility, has allowed us to generate 500 diverse, non-native, high-quality sequences for each of 264 protein structures in our test set. PSI-BLAST searches using the sequence profiles generated from the designed sequences ("reverse" BLAST searches) give near-perfect accuracy in identifying true structural homologues of the parent structure, with 54% coverage. In 41 of 49 genomes scanned using reverse BLAST searches, at least one novel structural template (not found by the standard method of PSI-BLAST against PDB) is identified. Further improvements in coverage, through optimizing the scoring function used to design sequences and continued application to new protein structures beyond the test set, will allow this method to mature into a useful strategy for identifying distantly related structural templates.
Copyright 2003 Wiley-Liss, Inc.
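As a rough illustration of the idea behind the "reverse" searches, the sketch below builds a position-specific log-odds profile from a handful of equal-length designed sequences and scores candidate sequences against it. It is not the PSI-BLAST procedure used in the study: the toy sequences, the uniform background, the fixed pseudocount, the ungapped scoring, and the function names `build_profile` and `score_ungapped` are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(sequences, pseudocount=1.0):
    """Build per-position log-odds scores from equal-length sequences.

    Assumes a uniform background distribution and a flat pseudocount;
    real profile methods use more elaborate weighting schemes.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        # Count how often each residue was designed at this position.
        counts = Counter(seq[pos] for seq in sequences)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        profile.append(column)
    return profile


def score_ungapped(profile, candidate):
    """Sum the profile scores of a candidate aligned without gaps."""
    if len(candidate) != len(profile):
        raise ValueError("candidate length must match profile length")
    return sum(column[aa] for column, aa in zip(profile, candidate))


if __name__ == "__main__":
    # Hypothetical designed sequences for a single five-residue backbone.
    designed = ["MKVLA", "MRVLA", "MKILG", "MKVIA"]
    profile = build_profile(designed)
    for candidate in ("MKVLA", "MRILG", "WWWWW"):
        print(candidate, round(score_ungapped(profile, candidate), 2))
```

Because every designed sequence shares the parent backbone, the sequences are already aligned position by position, which is why this sketch needs no alignment step before counting residues.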
