Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction

doi:10.1186/1471-2105-7-410

BMC Bioinformatics. 2006 Sep 14;7:410.

Eric D Scheeff¹, Philip E Bourne

PMID: 16970830
PMCID: PMC1622756
DOI: 10.1186/1471-2105-7-410

Eric D Scheeff et al. BMC Bioinformatics. 2006.

Affiliation

¹ San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093-0537, USA. scheeff@salk.edu

Abstract

Background: One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear.

Results: We explored several iterative protocols for the generation of profile hidden Markov models. These protocols were tailored to allow the inclusion of protein structure alignments in the process, and were used for large-scale creation and benchmarking of structure alignment-enhanced models. We found that models using structure alignments did not provide an overall improvement over sequence-only models for superfamily-level structure predictions. However, the results also revealed that the structure alignment-enhanced models were complementary to the sequence-only models, particularly at the edge of the "twilight zone". When the two sets of models were combined, they provided improved results over sequence-only models alone. In addition, we found that the beneficial effects of the structure alignment-enhanced models could not be realized if the structure-based alignments were replaced with sequence-based alignments. Our experiments with different iterative protocols for sequence-only models also suggested that simple protocol modifications could not yield improvements equivalent to those provided by the structure alignment-enhanced models. Finally, we found that models using structure alignments provided fold-level structure assignments that were superior to those produced by sequence-only models.

Conclusion: When attempting to predict the structure of remote homologs, we advocate a combined approach in which both traditional models and models incorporating structure alignments are used.
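As a rough illustration of the combined approach, the sketch below pools hits from a sequence-only model set and a structure alignment-enhanced model set and keeps the most significant hit for each probe. It is a minimal sketch and not the authors' pipeline: the `Hit` record, the `combine_hits` function, the example identifiers, and the assumption that E-values from the two model sets can be ranked on one scale are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    probe: str     # identifier of the query sequence (hypothetical)
    model: str     # identifier of the matching HMM (hypothetical)
    evalue: float  # lower means more significant


def combine_hits(*hit_lists):
    """Pool hits from several model sets and keep the best hit per probe.

    Assumes E-values from the different model sets are directly comparable.
    """
    best = {}
    for hits in hit_lists:
        for hit in hits:
            current = best.get(hit.probe)
            if current is None or hit.evalue < current.evalue:
                best[hit.probe] = hit
    # Rank the surviving hits from most to least significant.
    return sorted(best.values(), key=lambda h: h.evalue)


# Toy data: the second set rescues probe p2 and adds probe p3.
sequence_only = [Hit("p1", "seq_A", 1e-8), Hit("p2", "seq_B", 0.5)]
structure_enhanced = [Hit("p2", "str_B", 1e-4), Hit("p3", "str_C", 1e-3)]
combined = combine_hits(sequence_only, structure_enhanced)
```

In this toy case the combined list covers three probes, whereas each model set alone covers only two, which mirrors the complementarity described in the Results.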

Figures

**Figure 1**
Relative performance of single-master HMMs, SLAHMMs, and the combined models with differing iteration parameter sets (PS), presented as a plot of coverage vs. theoretical errors per query (EPQ). The parameter sets are defined in Table 2 and explained in the text. Values for correct assignments are truncated at 600 to emphasize differences between the methods (no method made an error before reaching 600 correct assignments).
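A coverage vs. EPQ curve of this kind can be sketched as follows. This is a minimal sketch under stated assumptions: hits are taken to be ranked from best to worst score, each hit is labeled correct or incorrect, and EPQ is taken to be the cumulative number of incorrect assignments divided by the number of queries. The function name `coverage_vs_epq` and the toy data are hypothetical.

```python
def coverage_vs_epq(ranked_hits, num_queries):
    """Return (errors_per_query, correct_count) points for a ranked hit list.

    ranked_hits: booleans sorted best score first, True = correct assignment.
    num_queries: total number of query sequences searched.
    """
    points = []
    correct = 0
    errors = 0
    for is_correct in ranked_hits:
        if is_correct:
            correct += 1
        else:
            errors += 1
        # One point per hit: cumulative errors normalized by query count.
        points.append((errors / num_queries, correct))
    return points


# Toy data: five ranked hits from a search with ten queries.
hits = [True, True, False, True, False]
curve = coverage_vs_epq(hits, num_queries=10)
```

Under this definition, the cutoff quoted in the Figure 3 caption (80 incorrect assignments at a theoretical EPQ of about 0.05) would correspond to roughly 1,600 queries; that figure is an inference from the caption, not a number stated here.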

**Figure 2**
Relative performance of different types of HMMs in assignment of structure to sequence probes, presented as a coverage vs. error plot. SLAHMM-CW refers to models built in the same way as SLAHMMs, but using only sequence information to align the SCOP domains rather than a structural alignment (see text). Iterative parameters used for construction of all models were from PS1 (Table 2). Values for correct assignments are truncated at 600 to emphasize differences between the methods (no method other than SLAHMM-CW made errors before reaching 600 correct assignments).

**Figure 3**
Venn diagram describing coverage overlap of the three primary model sets from PS1, when using a strict cutoff of 80 incorrect assignments (theoretical EPQ ~0.05). The numbers shown in parentheses near each model type designation refer to the total number of correct matches made by that model type prior to the cutoff point. Identical matches of the same probes by some or all of the three different methods are provided by the numbers in the set diagram. The completely unique matches by single-master HMMs and SLAHMMs are color coded to match the circle for that model type. "All Models" denotes the assignments made by the combined database of SLAHMMs and single-master HMMs used together in a single search.
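The region counts in a three-set diagram like this one can be derived with plain set operations. The sketch below is illustrative only: the function name `venn_regions`, the region labels, and the probe identifiers are hypothetical, and the toy sets do not reproduce the counts in the figure.

```python
def venn_regions(a, b, c):
    """Split three collections of probe IDs into the seven Venn regions."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


# Toy data: probes correctly matched by each model set before the cutoff.
single_master = {"p1", "p2", "p3", "p4"}
slahmm = {"p3", "p4", "p5"}
all_models = {"p2", "p3", "p4", "p5", "p6"}

regions = venn_regions(single_master, slahmm, all_models)
counts = {name: len(members) for name, members in regions.items()}
```

Here `counts["only_a"]` gives the matches unique to the first model set, which corresponds to the color-coded unique matches described in the caption.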

**Figure 4**
Relative performance of different types of HMMs in assignment of fold-level structure to sequence probes, presented as a coverage vs. error plot. Details of model types are provided in the text. Iterative parameters used for construction of all models were from PS1 (Table 2).

**Figure 5**
Comparison of HMMs built using an older protein sequence database for iterative construction ("old db") with those built using a current sequence database ("new db"), presented as a coverage vs. error plot. Results are colored similarly for corresponding model types, with the results based on the older database in a lighter color. A different version of the HMMER software was also used for the two result sets; details of model types and construction are provided in the text. Iterative parameters used for construction of all models were from PS1 (Table 2).

Cited by

Zhu J, Fan H, Periole X, Honig B, Mark AE. Refining homology models by combining replica-exchange molecular dynamics and statistical potentials. Proteins. 2008 Sep;72(4):1171-88. doi: 10.1002/prot.22005. PMID: 18338384. Free PMC article.
Kono N, Arakawa K, Tomita M. Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes. BMC Genomics. 2011 Jan 11;12:19. doi: 10.1186/1471-2164-12-19. PMID: 21223577. Free PMC article.
Johnson LS, Eddy SR, Portugaly E. Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics. 2010 Aug 18;11:431. doi: 10.1186/1471-2105-11-431. PMID: 20718988. Free PMC article.
Poulain P, Gelly JC, Flatters D. Detection and architecture of small heat shock protein monomers. PLoS One. 2010 Apr 7;5(4):e9990. doi: 10.1371/journal.pone.0009990. PMID: 20383329. Free PMC article.
