BMC Bioinformatics. 2006 Jul 27;7:364.

doi: 10.1186/1471-2105-7-364.

Improving the quality of protein structure models by selecting from alignment alternatives

Ingolf Sommer¹, Stefano Toppo, Oliver Sander, Thomas Lengauer, Silvio C E Tosatto

Affiliation

¹ Department of Computational Biology and Applied Algorithmics, Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany. sommer@mpi-sb.mpg.de

PMID: 16872519
PMCID: PMC1579234
DOI: 10.1186/1471-2105-7-364

Ingolf Sommer et al. BMC Bioinformatics. 2006.

Abstract

Background: In protein structure prediction, considerable recent effort has gone into developing Model Quality Assessment Programs (MQAPs), which distinguish high-quality protein structure models from inferior ones. Here, we propose a new method that uses an MQAP to improve the quality of models. Given a target sequence and a template structure, we construct a number of different alignments and the corresponding models for the sequence. The quality of these models is scored with an MQAP, and the scores are used to choose the most promising model. An SVM-based selection scheme is suggested for combining MQAP partial potentials in order to optimize model selection.
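The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `CandidateModel` class, the score values, and the weights are all hypothetical, a plain linear combination stands in for the SVM-based scheme, and "higher combined score is better" is an assumed convention.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CandidateModel:
    alignment_id: str                # hypothetical label for one alignment variant
    partial_scores: Sequence[float]  # hypothetical MQAP partial potentials


def combined_score(model: CandidateModel, weights: Sequence[float]) -> float:
    # Linear combination of partial scores; in a trained selector the
    # weights would be learned, here they are made-up numbers.
    return sum(w * s for w, s in zip(weights, model.partial_scores))


def select_model(candidates: Sequence[CandidateModel],
                 weights: Sequence[float]) -> CandidateModel:
    # Pick the candidate with the highest combined score (assumed convention).
    if not candidates:
        raise ValueError("no candidate models to select from")
    return max(candidates, key=lambda m: combined_score(m, weights))


# Toy data: three models built from different alignments to the same template.
candidates = [
    CandidateModel("default", (0.2, 0.5, 0.1)),
    CandidateModel("variant_a", (0.6, 0.4, 0.3)),
    CandidateModel("variant_b", (0.1, 0.9, 0.2)),
]
weights = (1.0, 0.5, 2.0)
best = select_model(candidates, weights)
print(best.alignment_id)
```

Keeping the scoring function separate from the selection makes it easy to swap the linear stand-in for a learned decision function.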

Results: The approach was tested on a representative set of proteins. The ability of the method to improve models was validated by comparing the MQAP-selected structures to the native structures with the model quality evaluation program TM-score. Using the SVM-based model selection, a significant increase in model quality is obtained (a Wilcoxon signed-rank test yields p-values below 10⁻¹⁵). The average increase in TM-score is 0.016; the maximum observed increase in TM-score is 0.29.
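The kind of paired comparison reported above can be illustrated with a small standard-library sketch. It computes the mean and maximum TM-score improvement and a two-sided Wilcoxon signed-rank test using the normal approximation, without tie or continuity corrections. The function name and the TM-score values are made up for illustration, and the approximation is only reasonable for large samples such as a full benchmark set.

```python
import math


def wilcoxon_signed_rank(baseline, selected):
    """Two-sided Wilcoxon signed-rank test (normal approximation)."""
    # Zero differences are discarded, as in the classical procedure.
    diffs = [s - b for b, s in zip(baseline, selected) if s != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, z, p_value


# Made-up TM-scores of default models and of the selected alternatives.
default_tm = [0.50, 0.62, 0.40, 0.71, 0.33, 0.58, 0.45, 0.66]
selected_tm = [0.53, 0.61, 0.47, 0.75, 0.35, 0.63, 0.51, 0.66]

improvements = [s - d for d, s in zip(default_tm, selected_tm)]
mean_gain = sum(improvements) / len(improvements)
max_gain = max(improvements)
w_plus, z, p_value = wilcoxon_signed_rank(default_tm, selected_tm)
print(mean_gain, max_gain, w_plus, p_value)
```

A paired, rank-based test fits this setting because each target contributes one default and one selected model, and the TM-score differences need not be normally distributed.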

Conclusion: In template-based protein structure prediction, alignment is known to be a bottleneck limiting the overall model quality. Here we show that a combination of systematic alignment variation and modern model scoring functions can significantly improve the quality of alignment-based models.

Figures

**Figure 1**
Overview of model quality improvement with respect to the model difficulty. Left: PVS; right: PVH. Each dot corresponds to a model, where the x-coordinate is the TM-score of the corresponding default Arby model and the y-coordinate is the TM-score improvement with respect to this default model. Smoothed quantile lines are shown for the 10% (lower dashed), 50% (middle), and 90% (upper dashed) quantiles of the models within a sliding window of size 0.15. Black lines represent all models, red lines represent the models selected using FRST, and green lines represent the models selected using the SVM approach. For the smoothing, evaluations are made at 1000 equidistant points, and the resulting quantiles are smoothed with a lowess function (local linear scatter plot smoother). **Interpretation:** The TM-score of the Arby default gives an indication of how difficult it is to find the right template for a target. For the selection methods random, FRST, and SVM, this plot shows the potential improvement with respect to the difficulty of the target. For PVH, more models are generated below default. For both PVS and PVH, the SVM selection performs better than FRST selection, and FRST performs better than random.
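The quantile lines in this caption could be computed along the following lines. This is a rough sketch under stated assumptions: the window is taken to be centered on each evaluation point with total width 0.15, quantiles use linear interpolation, the final lowess smoothing is omitted, and the function names and data points are invented for illustration.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def sliding_quantiles(points, window=0.15, n_eval=1000, qs=(0.1, 0.5, 0.9)):
    """points: (default TM-score, improvement) pairs; n_eval must be >= 2.

    Returns a list of (x, [quantile values]) for every evaluation point
    whose window contains at least one model.
    """
    xs = [x for x, _ in points]
    x_min, x_max = min(xs), max(xs)
    step = (x_max - x_min) / (n_eval - 1)
    curves = []
    for k in range(n_eval):
        center = x_min + k * step
        # Improvements of all models whose default TM-score is in the window.
        in_window = sorted(y for x, y in points if abs(x - center) <= window / 2)
        if in_window:
            curves.append((center, [quantile(in_window, q) for q in qs]))
    return curves


# Made-up (default TM-score, improvement) pairs.
points = [(0.30, 0.00), (0.32, 0.02), (0.34, -0.01), (0.60, 0.05), (0.62, 0.01)]
curves = sliding_quantiles(points, window=0.15, n_eval=5)
for x, (q10, q50, q90) in curves:
    print(round(x, 2), round(q10, 3), round(q50, 3), round(q90, 3))
```

Evaluation points with an empty window are skipped rather than interpolated, so sparse regions of the difficulty axis leave gaps in the curves.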

**Figure 2**
(Left) Average increase in TM-score, for ranges of difficulty. Targets are binned according to the TM-score of the default Arby model. Within each bin, the average increase in quality *qim* is plotted. Bins are enumerated horizontally; the two outer bins were merged with their neighbors, as each contained fewer than 100 target samples. Models are selected from *PVS* using the SVM. For comparison, the average increase in quality obtained on this benchmark set by performing loop modeling is 0.003. (Right) Maximum increase in TM-score, for the same ranges of difficulty. The maximum increase in quality max *qim* within each bin is visualized as a line above the box representing the average increase (which is the same as on the left side, only the scale differs).
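The binning behind this figure can be sketched as below. The bin width of 0.1, the function name, and all values are assumptions made for illustration (the caption does not state the bin boundaries), and the merging of sparsely populated outer bins is left out.

```python
from collections import defaultdict


def bin_improvements(default_tm, improvements, bin_width=0.1):
    """Group improvements by the default model's TM-score.

    Returns {bin index: (mean improvement, max improvement, count)}.
    """
    last_bin = int(round(1 / bin_width)) - 1
    bins = defaultdict(list)
    for tm, qim in zip(default_tm, improvements):
        # A TM-score of exactly 1.0 is placed in the last bin.
        index = min(int(tm / bin_width), last_bin)
        bins[index].append(qim)
    return {i: (sum(v) / len(v), max(v), len(v)) for i, v in sorted(bins.items())}


# Made-up default TM-scores and improvements for five targets.
default_tm = [0.25, 0.28, 0.41, 0.47, 0.45]
qim = [0.01, 0.03, 0.00, 0.10, 0.05]
summary = bin_improvements(default_tm, qim)
for index, (mean_qim, max_qim, count) in summary.items():
    print(index, round(mean_qim, 3), max_qim, count)
```

Reporting the count alongside the mean and maximum makes it visible when a bin is too sparse to interpret on its own, which is the situation the caption handles by merging the outer bins.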
