Bioinformatics. 2011 Feb 1;27(3):343-50.
doi: 10.1093/bioinformatics/btq662. Epub 2010 Dec 5.

Toward the estimation of the absolute quality of individual protein structure models

Pascal Benkert et al. Bioinformatics.

Abstract

Motivation: Quality assessment of protein structures is an important part of experimental structure validation and plays a crucial role in protein structure prediction, where the predicted models may contain substantial errors. Most current scoring functions are primarily designed to rank alternative models of the same sequence supporting model selection, whereas the prediction of the absolute quality of an individual protein model has received little attention in the field. However, reliable absolute quality estimates are crucial to assess the suitability of a model for specific biomedical applications.

Results: In this work, we present a new absolute measure for the quality of protein models, which provides an estimate of the 'degree of nativeness' of the structural features observed in a model and describes the likelihood that a given model is of comparable quality to experimental structures. Model quality estimates based on the QMEAN scoring function were normalized with respect to the number of interactions. The resulting scoring function is independent of the size of the protein and may therefore be used to assess both monomers and entire oligomeric assemblies. Model quality scores for individual models are then expressed as 'Z-scores' in comparison to scores obtained for high-resolution crystal structures. We demonstrate the ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models. In a comprehensive QMEAN Z-score analysis of all experimental structures in the PDB, membrane proteins accumulate on one side of the score spectrum and thermostable proteins on the other. Proteins from the thermophilic organism Thermotoga maritima received significantly higher QMEAN Z-scores in a pairwise comparison with their homologous mesophilic counterparts, underlining the significance of the QMEAN Z-score as an estimate of protein stability.
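
The two steps described in the Results (normalizing a score by the number of interactions, then expressing it as a Z-score against a reference set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the QMEAN implementation: the function names, the plain division used for normalization, the use of the sample standard deviation, and the reference values are all hypothetical, and the abstract does not specify how the reference set is selected for a given model.

```python
from statistics import mean, stdev


def normalized_score(raw_score: float, n_interactions: int) -> float:
    """Divide a raw score by the number of interactions it was summed over.

    Assumption: a simple per-interaction average; the paper's exact
    normalization may differ.
    """
    if n_interactions <= 0:
        raise ValueError("n_interactions must be positive")
    return raw_score / n_interactions


def z_score(model_score: float, reference_scores: list[float]) -> float:
    """Express a model's score in standard deviations from the reference mean.

    reference_scores stands in for normalized scores of high-resolution
    crystal structures (hypothetical values in the usage note).
    """
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference scores")
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("reference scores have zero spread")
    return (model_score - mu) / sigma
```

With hypothetical reference scores such as [0.70, 0.75, 0.80, 0.85, 0.90], a model scoring 0.80 sits at the reference mean (Z-score 0), while a model scoring 0.60 falls well below it and receives a negative Z-score, matching the convention in the figure captions where negative values are unfavourable.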

Availability: The Z-score calculation has been integrated in the QMEAN server available at: http://swissmodel.expasy.org/qmean.

Figures

Fig. 1. Comparison between traditional (A) and normalized all-atom interaction score (B) on a non-redundant set of 9766 high-resolution PDB chains.

Fig. 2. Normalized QMEAN score composed of four statistical potential terms (QMEAN4) of 9766 high-resolution structures. Red crosses indicate chains belonging to membrane proteins; blue crosses denote other proteins deviating by more than 3 standard deviations (see Supplementary Table S1 for details).

Fig. 3. Oligomeric complex of mammalian actin (in grey) with toxofilin (chain T, blue) from Toxoplasma gondii [PDB:2Q97; (Lee et al., 2007)]. In the complex, toxofilin adopts a non-globular conformation, which is meaningless in isolation. As expected, the QMEAN Z-score of −3.3 for toxofilin in isolation is unfavourable (Table 2).

Fig. 4. QMEAN scores for all structures in the biological unit reference set. Proteins with unusually high QMEAN scores (Z-score >3), marked in green, correspond almost exclusively to proteins from thermophilic organisms (see also Supplementary Table S5).

Fig. 5. Correlation between QMEAN and GDT_TS for all server models of CASP8. (A) Scatter plot, (B) boxplot.

Fig. 6. Density plot visualizing the QMEAN Z-score distribution of theoretical protein structure models. Z-scores for models from CASP8 are shown in relation to scores of experimental reference structures (black line). The models are split into three quality ranges, with low-quality models in red, medium-quality models in blue and good models in green.
