Proteins. 2018 Mar;86 Suppl 1(Suppl 1):387-398.

doi: 10.1002/prot.25431. Epub 2017 Dec 17.

Continuous Automated Model EvaluatiOn (CAMEO) complementing the critical assessment of structure prediction in CASP12

Jürgen Haas^{1,2}, Alessandro Barbato^{1,2}, Dario Behringer^{1,2}, Gabriel Studer^{1,2}, Steven Roth^{1,2}, Martino Bertoni^{1,2}, Khaled Mostaguir^{1,2}, Rafal Gumienny^{1,2}, Torsten Schwede^{1,2}

Affiliations

¹ Biozentrum, University of Basel, Switzerland.
² SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland.

PMID: 29178137
PMCID: PMC5820194
DOI: 10.1002/prot.25431

Jürgen Haas et al. Proteins. 2018 Mar.

Abstract

Every second year, the community experiment "Critical Assessment of Techniques for Structure Prediction" (CASP) conducts an independent blind assessment of structure prediction methods, providing a framework for comparing the performance of different approaches and discussing the latest developments in the field. Yet developers of automated computational modeling methods clearly benefit from more frequent evaluations based on larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements the CASP experiment by conducting fully automated blind prediction assessments based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). CAMEO publishes weekly benchmarking results based on models collected during a 4-day prediction window, on average assessing ca. 100 targets during a time frame of 5 weeks. CAMEO benchmarking data is generated consistently for all participating methods at the same point in time, enabling developers to benchmark and cross-validate their method's performance, and to refer directly to the benchmarking results in publications. In order to facilitate server development and promote shorter release cycles, CAMEO sends weekly emails with submission statistics and low-performance warnings. Many participants of CASP have successfully employed CAMEO when preparing their methods for upcoming community experiments. CAMEO offers a variety of scores to allow benchmarking of diverse aspects of structure prediction methods. By introducing new scoring schemes, CAMEO facilitates new development in areas of active research, for example, modeling quaternary structure, complexes, or ligand binding sites.

Keywords: CAMEO; CASP; benchmarking; continuous evaluation; ligand binding site accuracy; model confidence; model quality assessment; oligomeric assessment; protein structure modeling; protein structure prediction.

Conflict of interest statement

None.

Figures

**FIGURE 1**
Illustration of the lDDT-BS analysis for target 2016-07-30_00000063_1 (Myroilysin, PDB ID 5CZW, black cartoon). For the evaluation, a reference residue set is created from all residues within a 3 Å radius of the zinc ion (light green sphere). The lDDT is calculated based on a 10 Å inclusion radius (grey sphere, grey sticks). The zinc-coordinating residues CYS23, HIS137, HIS141 and HIS147 are shown as yellow sticks. Residues of the structure predictions matching the reference set are displayed both as ribbons and as sticks, in orange for SWISS-MODEL, in blue for IntFOLD4-TS and in magenta for Sparks-X. All predictions reproduced the histidine residues with little variation from the reference structure, while they showed a much greater deviation for cysteine 23.
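
The selection of the reference residue set described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binding_site_residues`, the rule that a residue counts as soon as any one of its atoms lies within the radius, and the toy coordinates are all hypothetical and are not taken from CAMEO or from PDB entry 5CZW.

```python
import math

def binding_site_residues(residue_atoms, ligand_atoms, radius=3.0):
    """Return the ids of residues with at least one atom within `radius` (in Å)
    of any ligand atom. The any-atom criterion is an assumption of this sketch."""
    selected = []
    for res_id, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= radius for a in atoms for l in ligand_atoms):
            selected.append(res_id)
    return sorted(selected)

# Toy coordinates in Å (hypothetical, for illustration only).
zinc = [(0.0, 0.0, 0.0)]
residues = {
    "HIS137": [(2.1, 0.0, 0.0), (3.4, 0.5, 0.0)],
    "HIS141": [(0.0, -2.0, 0.3)],
    "GLY50": [(8.0, 1.0, 0.0)],
}

site = binding_site_residues(residues, zinc)
print(site)
```

In this toy case the two histidines fall inside the 3 Å sphere and the distant glycine does not, so only the histidines would enter the reference set on which the binding-site score is then computed.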

**FIGURE 2**
Model confidence ROC plot based on pooling all residues from all predictions across a 3-month time frame matching the CASP12 prediction season, applying a classification threshold of 60 lDDT. All servers that were public at the time are shown.
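
A ROC analysis of this kind can be illustrated with a small, self-contained sketch. It assumes that each residue carries a predicted confidence value and an observed local lDDT on a 0 to 100 scale, and that a residue counts as "correct" when its lDDT is at least 60; whether the comparison is inclusive is an assumption here. The helper names `roc_points` and `auc`, and the six data points, are hypothetical.

```python
def roc_points(scores, labels):
    """Sweep a cutoff over the predicted scores, highest first, and
    return the resulting (false positive rate, true positive rate) points."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Consume all residues tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Pooled per-residue toy data (hypothetical values).
predicted_confidence = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
observed_lddt = [85, 72, 40, 65, 30, 10]
labels = [value >= 60 for value in observed_lddt]  # classification threshold of 60 lDDT

points = roc_points(predicted_confidence, labels)
print(round(auc(points), 3))
```

Pooling all residues before the sweep, as the caption describes, means a single curve per method summarizes how well its confidence values separate accurately modeled residues from inaccurate ones.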

**FIGURE 3**
Panel A - ROC analysis of the residue-wise error estimates of the public methods active in CAMEO from 2016-05-01 to 2016-07-30. Historically well-performing tools such as Verify3D, Prosa and Dfire are outperformed by newer methods. Panel B - public methods currently available in CAMEO. New methods are constantly emerging, such as QMEANDisCo, eQuant2 [53] and ModFOLD6, with QMEANDisCo currently holding a narrow lead over ModFOLD6. The insets show the lDDT distribution of the underlying 3D models serving as targets for the QE category.

**FIGURE 4**
Illustration of the limitations of superposition-based Cα distances in estimating model accuracy. Panel A - Cα distances are compared to the corresponding local lDDT values for three exemplary quality predictions from CASP12. Cases of high Cα deviations contrasting with high quality assigned by the all-atom lDDT values are indicated as red data points at the top right of the graph, and short Cα distances that contrast with low lDDT values are indicated as yellow and orange data points in the lower left area of the plot. The following examples illustrate reasons for these discrepancies: Panel B - the underlying global superposition fails for large domain movements and multi-domain proteins (red circles in panel A). Panel C - 90% of the atomic interactions are missed by focusing on Cα atoms, limiting in particular the assessment of high quality models. Panel D - evaluation of residue neighborhoods is implicitly excluded when considering Cα atoms only. lDDT assigns low scores to residues with large stereochemical deviations and physically impossible close contacts (e.g. yellow data points at an lDDT value of 0.0, translating to unphysically positioned backbone and side chain atoms). The background image in panel A represents the data for all QE-stage2 submissions in CASP12.
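
The superposition-free idea behind lDDT can be illustrated with a deliberately simplified sketch. It assumes one point per residue, the distance tolerances of 0.5, 1, 2 and 4 Å and the default 15 Å inclusion radius from the original lDDT publication, and it omits the stereochemistry checks and all-atom treatment mentioned in the caption. The function name `simplified_lddt` and the coordinates are hypothetical; this is not the implementation used by CAMEO.

```python
import math

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # distance tolerances in Å

def simplified_lddt(reference, model, inclusion_radius=15.0):
    """Fraction of reference distances (within the inclusion radius) that are
    preserved in the model, averaged over the four tolerances. No superposition
    is needed because only internal distances are compared."""
    preserved = 0
    total = 0
    n = len(reference)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = math.dist(reference[i], reference[j])
            if d_ref > inclusion_radius:
                continue  # pair lies outside the local neighborhood
            d_mod = math.dist(model[i], model[j])
            diff = abs(d_ref - d_mod)
            total += len(THRESHOLDS)
            preserved += sum(diff < t for t in THRESHOLDS)
    return preserved / total if total else 0.0

# Four points on a line, 3.8 Å apart (toy coordinates).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

# A rigidly translated copy: every coordinate differs, internal distances do not.
translated = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in reference]

# A copy with the last point displaced by 3 Å, changing its distances to the others.
distorted = reference[:3] + [(11.4, 3.0, 0.0)]

score_translated = simplified_lddt(reference, translated)
score_distorted = simplified_lddt(reference, distorted)
print(score_translated, score_distorted)
```

The translated copy keeps a perfect score even though a coordinate comparison without superposition would report a large deviation, whereas the locally distorted copy is penalized. This is the property that lets a distance-based score remain meaningful for domain movements where a single global superposition fails.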

Grants and funding

U01 GM093324/GM/NIGMS NIH HHS/United States