BMC Struct Biol. 2009 May 20;9:35.

doi: 10.1186/1472-6807-9-35.

QMEANclust: estimation of protein model quality by combining a composite scoring function with structural density information

Pascal Benkert¹, Torsten Schwede, Silvio Ce Tosatto

PMID: 19457232
PMCID: PMC2709111
DOI: 10.1186/1472-6807-9-35

Pascal Benkert et al. BMC Struct Biol. 2009.

Affiliation

¹ Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland. pascal.benkert@unibas.ch


Abstract

Background: The selection of the most accurate protein model from a set of alternatives is a crucial step in protein structure prediction, in both template-based and ab initio approaches. Scoring functions have been developed which can either return a quality estimate for a single model or derive a score from the information contained in the ensemble of models for a given sequence. Local structural features occurring more frequently in the ensemble have a greater probability of being correct. Within the context of the CASP experiment, these so-called consensus methods have been shown to perform considerably better in selecting good candidate models, but they tend to fail if the best models are far from the dominant structural cluster. In this paper we show that model selection can be improved if both approaches are combined by pre-filtering the models used during the calculation of the structural consensus.

Results: Our recently published QMEAN composite scoring function has been improved by including an all-atom interaction potential term. The preliminary model ranking based on the new QMEAN score is used to select a subset of reliable models against which the structural consensus score is calculated. This scoring function, called QMEANclust, achieves a correlation coefficient between predicted quality score and GDT_TS of 0.9 averaged over the 98 CASP7 targets and performs significantly better in selecting good models from the ensemble of server models than any other group participating in the quality estimation category of CASP7. Both scoring functions are also benchmarked on the MOULDER test set consisting of 20 target proteins, each with 300 alternative models generated by MODELLER. QMEAN outperforms all other tested scoring functions operating on individual models, while the consensus method QMEANclust only works properly on decoy sets containing a certain fraction of near-native conformations. We also present a local version of QMEAN for the per-residue estimation of model quality (QMEANlocal) and compare it to a new local consensus-based approach.

Conclusion: Improved model selection is obtained by using a composite scoring function operating on single models to enrich for higher-quality models, which are subsequently used to calculate the structural consensus. The performance of consensus-based methods such as QMEANclust depends strongly on the composition and quality of the model ensemble to be analysed. Therefore, performance estimates for consensus methods based on large meta-datasets (e.g. CASP) might overrate their applicability in more realistic modelling situations with smaller sets of models based on individual methods.
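
The pre-filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the published QMEANclust implementation: the function name `consensus_scores`, the `keep_fraction` parameter and its values, and the use of a precomputed pairwise similarity matrix (for example a GDT_TS-like score scaled to [0, 1]) are all assumptions made for illustration.

```python
from statistics import mean


def consensus_scores(single_model_scores, similarity, keep_fraction=0.5):
    """Score each model by its mean similarity to a pre-filtered reference subset.

    single_model_scores: one score per model, higher is better (for example
        from a single-model scoring function such as QMEAN).
    similarity: square matrix with similarity[i][j] in [0, 1] for models i, j
        (assumed to be precomputed; hypothetical input format).
    keep_fraction: fraction of top-ranked models kept as reference subset
        (illustrative value, not taken from the paper).
    """
    n = len(single_model_scores)
    # Keep at least two reference models so every model has a partner.
    n_keep = min(n, max(2, round(n * keep_fraction)))
    ranked = sorted(range(n), key=lambda i: single_model_scores[i], reverse=True)
    reference = ranked[:n_keep]

    scores = []
    for i in range(n):
        # A model is never compared with itself.
        partners = [similarity[i][j] for j in reference if j != i]
        scores.append(mean(partners) if partners else 0.0)
    return scores


# Toy ensemble: models 0 and 1 are near-native, models 2 to 4 form a
# larger cluster of mutually similar but incorrect models.
good = {0, 1}


def toy_similarity(i, j):
    if i == j:
        return 1.0
    if (i in good) != (j in good):
        return 0.3
    return 0.9 if i in good else 0.8


toy_matrix = [[toy_similarity(i, j) for j in range(5)] for i in range(5)]
toy_single = [0.8, 0.7, 0.4, 0.5, 0.3]

plain = consensus_scores(toy_single, toy_matrix, keep_fraction=1.0)
filtered = consensus_scores(toy_single, toy_matrix, keep_fraction=0.4)
```

In this toy ensemble the unfiltered consensus (`keep_fraction=1.0`) favours the larger incorrect cluster, while restricting the reference subset to the two top-ranked models moves the highest consensus score to a near-native model, which is the failure mode and remedy the abstract describes.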

Figures

**Figure 1**
**Analysis of the statistical significance based on a one-sided paired t-test (95% confidence level)**. Green: Method denoted on the horizontal axis performs significantly better. Red: Method denoted on the horizontal axis performs significantly worse. a) Pearson's correlation coefficient, b) Spearman's rank correlation coefficient, c) GDT_TS values of the models selected by a scoring function.
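
For readers unfamiliar with the two correlation measures compared in panels a) and b), the following Python sketch computes both for a list of predicted scores and the corresponding GDT_TS values. The helper names `pearson`, `average_ranks` and `spearman` are illustrative and are not taken from the paper; the numbers in the usage lines are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: predicted quality scores and GDT_TS for four models.
predicted = [0.2, 0.4, 0.5, 0.9]
gdt_ts = [10.0, 30.0, 35.0, 90.0]
r_pearson = pearson(predicted, gdt_ts)
r_spearman = spearman(predicted, gdt_ts)
```

Because the made-up scores order the models exactly as GDT_TS does, the rank correlation reaches its maximum even though the relationship is not perfectly linear.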

**Figure 2**
**Comparison of QMEAN, a 3D-Jury-like approach and QMEANclust on 3 selected CASP7 targets**. The table shows the GDT_TS difference between the best model selected by QMEANclust and by the 3D-Jury approach. Correlations between predicted score and GDT_TS of three targets are shown for QMEAN, 3D-Jury and QMEANclust (from left to right). The dashed areas mark the models selected by QMEAN as the basis for QMEANclust. The arrow on the right of each plot denotes the best selected model.

**Figure 3**
**Receiver operator characteristic (ROC) curves for the different local QMEAN versions and ProQres**. A Cα distance cut-off of 2.5 Å has been used. Two alternative QMEANclust approaches have been tested which combine the local Cα distances using median or weighted mean.
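
The two ways of combining local Cα distances mentioned in the caption, and the role of the 2.5 Å cut-off in a ROC analysis, can be sketched as follows. This is an illustration under stated assumptions rather than the published procedure: the per-residue distances are assumed to be already available from superpositions with the reference models, the weights are hypothetical (for example a global score per reference model), and treating a residue as correct when its deviation is strictly below the cut-off is an assumption.

```python
from statistics import median

CUTOFF = 2.5  # Cα distance cut-off in Å, as stated in the caption


def combine_median(distances):
    """Median of one residue's Cα distances to the reference models."""
    return median(distances)


def combine_weighted_mean(distances, weights):
    """Weighted mean of the same distances (weights are hypothetical)."""
    return sum(d * w for d, w in zip(distances, weights)) / sum(weights)


def roc_point(predicted, actual, threshold, cutoff=CUTOFF):
    """Return (false positive rate, true positive rate) for one threshold.

    predicted: predicted per-residue deviations in Å.
    actual: observed per-residue Cα deviations from the native structure in Å.
    A residue counts as truly correct if actual < cutoff (assumed convention)
    and as predicted correct if predicted < threshold.
    """
    tp = fp = positives = negatives = 0
    for p, a in zip(predicted, actual):
        truly_correct = a < cutoff
        positives += int(truly_correct)
        negatives += int(not truly_correct)
        if p < threshold:
            tp += int(truly_correct)
            fp += int(not truly_correct)
    fpr = fp / negatives if negatives else 0.0
    tpr = tp / positives if positives else 0.0
    return fpr, tpr


# Made-up distances of one residue to three reference models.
residue_distances = [1.0, 1.2, 8.0]
by_median = combine_median(residue_distances)
by_weighted_mean = combine_weighted_mean(residue_distances, [1.0, 1.0, 2.0])
```

Sweeping `threshold` over the range of predicted values and collecting the resulting points traces out a ROC curve. The median is less sensitive than the weighted mean to a single outlying reference model, as the made-up distances show.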

Cited by

Regulation of the PI3K pathway through a p85α monomer-homodimer equilibrium.
Cheung LW, Walkiewicz KW, Besong TM, Guo H, Hawke DH, Arold ST, Mills GB. Elife. 2015 Jul 29;4:e06866. doi: 10.7554/eLife.06866. PMID: 26222500.
Determination of pentapeptide repeat units in Qnr proteins by the structure-based alignment approach.
Park KS, Lee JH, Jeong DU, Lee JJ, Wu X, Jeong BC, Kang CM, Lee SH. Antimicrob Agents Chemother. 2011 Sep;55(9):4475-8. doi: 10.1128/AAC.00041-11. Epub 2011 Jun 27. PMID: 21709088.
hCALCRL mutation causes autosomal recessive nonimmune hydrops fetalis with lymphatic dysplasia.
Mackie DI, Al Mutairi F, Davis RB, Kechele DO, Nielsen NR, Snyder JC, Caron MG, Kliman HJ, Berg JS, Simms J, Poyner DR, Caron KM. J Exp Med. 2018 Sep 3;215(9):2339-2353. doi: 10.1084/jem.20180528. Epub 2018 Aug 16. PMID: 30115739.
Homology-Based Modeling of Universal Stress Protein from Listeria innocua Up-Regulated under Acid Stress Conditions.
Tremonte P, Succi M, Coppola R, Sorrentino E, Tipaldi L, Picariello G, Pannella G, Fraternali F. Front Microbiol. 2016 Dec 20;7:1998. doi: 10.3389/fmicb.2016.01998. eCollection 2016. PMID: 28066336.
Prediction of a new class of RNA recognition motif.
Cerdà-Costa N, Bonet J, Fernández MR, Avilés FX, Oliva B, Villegas S. J Mol Model. 2011 Aug;17(8):1863-75. doi: 10.1007/s00894-010-0888-0. Epub 2010 Nov 17. PMID: 21082207.


