Proteins. 2018 Mar;86 Suppl 1(Suppl 1):7-15.
doi: 10.1002/prot.25415. Epub 2017 Dec 15.

Critical assessment of methods of protein structure prediction (CASP)-Round XII

John Moult et al. Proteins. 2018 Mar.

Abstract

This article reports the outcome of the 12th round of Critical Assessment of Structure Prediction (CASP12), held in 2016. CASP is a community experiment to determine the state of the art in modeling protein structure from amino acid sequence. Participants are provided sequence information and in turn provide protein structure models and related information. Analysis of the submitted structures by independent assessors provides a comprehensive picture of the capabilities of current methods, and allows progress to be identified. This was again an exciting round of CASP, with significant advances in 4 areas: (i) The use of new methods for predicting three-dimensional contacts led to a two-fold improvement in contact accuracy. (ii) As a consequence, model accuracy for proteins where no template was available improved dramatically. (iii) Models based on a structural template showed overall improvement in accuracy. (iv) Methods for estimating the accuracy of a model continued to improve. CASP continued to develop new areas: (i) Assessing methods for building quaternary structure models, including an expansion of the collaboration between CASP and CAPRI. (ii) Modeling with the aid of experimental data was extended to include SAXS data, as well as again using chemical cross-linking information. (iii) A team of assessors evaluated the suitability of models for a range of applications, including mutation interpretation, analysis of ligand binding properties, and identification of interfaces. This article describes the experiment and summarizes the results. The rest of this special issue of PROTEINS contains papers describing CASP12 results and assessments in more detail.

Keywords: CASP; community wide experiment; protein structure prediction.

Figures

Figure 1
Contact prediction accuracy in CASPs 11 and 12 against effective alignment depth. As expected, accuracy increases with alignment depth, and for a number of CASP12 targets with deep alignments, precision is 100%. Best results on the set of free modeling targets are shown. Precision is for the most confidently predicted L/5 contacts separated by more than 23 residues in the sequence, where L is the target length. Neff is the number of diverse (less than 90% ID) homologous sequences covering at least 60% of the target with an E-score of 10^-3 or better, retrieved by HHblits from the uniprot20 database.
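The precision measure described in this caption can be sketched as follows. This is a minimal illustration, not the CASP assessors' code: the function name, the input layout (a list of `(i, j, confidence)` tuples and a set of native contact pairs), and the `min_separation` parameter are assumptions made for the example.

```python
def top_l5_precision(predicted, native_contacts, target_length, min_separation=23):
    """Precision of the L/5 most confident long-range contact predictions.

    predicted: iterable of (i, j, confidence) tuples (hypothetical layout).
    native_contacts: iterable of (i, j) residue pairs in contact in the
        experimental structure.
    A pair counts as long-range when |i - j| > min_separation.
    """
    # Keep only pairs separated by more than min_separation residues.
    long_range = [(i, j, c) for i, j, c in predicted if abs(i - j) > min_separation]
    # Rank by confidence, highest first, and keep the top L/5.
    long_range.sort(key=lambda item: item[2], reverse=True)
    top = long_range[: max(1, target_length // 5)]
    if not top:
        return 0.0
    # Normalise pair order so (i, j) and (j, i) compare equal.
    native = {(min(i, j), max(i, j)) for i, j in native_contacts}
    hits = sum((min(i, j), max(i, j)) in native for i, j, _ in top)
    return hits / len(top)
```

For a target of length 10, L/5 is 2, so only the two most confident long-range pairs are scored; short-range pairs are ignored however confident they are.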
Figure 2
Backbone accuracy (GDT_TS) of the best submitted models in the free modeling category for the three most recent CASPs, as a function of target length. Good performance for targets smaller than 100 residues mostly reflects earlier improvements in this category. In CASP10, no models longer than 100 residues had GDT_TS greater than 50. In CASP11, four crossed this threshold. In CASP12, half of the targets longer than 100 residues do so. (On the GDT_TS scale, 100 is perfect agreement with experiment, 20 to 30 is typically random, and structures with scores above 50 are largely topologically correct.)
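To make the GDT_TS scale concrete, here is a simplified sketch. GDT_TS averages the percentage of residues whose C-alpha atoms lie within 1, 2, 4, and 8 Å of the experimental positions. The real measure searches over many superpositions to maximise each percentage; this sketch assumes the model has already been superimposed on the experimental structure, so it gives only a lower bound on the true score. The function name and input layout are hypothetical.

```python
import math


def gdt_ts_fixed_superposition(model_ca, reference_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS for one fixed superposition.

    model_ca, reference_ca: equal-length sequences of (x, y, z) C-alpha
    coordinates in angstroms, already superimposed (an assumption of this
    sketch; the full measure optimises the superposition per cutoff).
    """
    if len(model_ca) != len(reference_ca) or len(reference_ca) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Per-residue deviation between model and experiment.
    distances = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    # Average the four fractions and express as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)
```

A model identical to the experimental structure scores 100; a model in which most residues deviate by more than 8 Å scores close to 0.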
Figure 3
Relationship between highest backbone accuracy (GDT_TS) and highest contact prediction accuracy for free modeling targets in CASP12. Average structure accuracy doubles as contact accuracy increases, demonstrating that high accuracy is a consequence of the availability of largely correct contacts. (Precision is for the L/5 most confidently predicted contacts separated by at least 23 residues in the sequence, where L is the target length.)
Figure 4
Superposition of the best model received for target T0866, the periplasmic domain of MlaD from E. coli (blue), with the corresponding experimental structure (turquoise, PDB 4cx8). There were no sequence-detectable templates for this protein, and the outstandingly accurate model is largely due to successful prediction of a set of three-dimensional contacts.
Figure 5
Trend lines for best model backbone accuracy (by GDT_TS) in CASP5 (2002), CASP11, and the most recent CASP12, for the template-based modeling targets (TBM and TBM/FM). By this measure, there was only modest improvement in the 12 years between CASP5 and CASP11, but a substantial jump in the last two years. Points show the CASP11 and CASP12 best models for each target. The case of T0868 is discussed in the text and shown in Figure 6. The ‘Target Difficulty’ rank of each target is based on its sequence and structure similarity to the closest template.
Figure 6
Example of accurate template-based modeling for a relatively difficult target, T0868, a bacterial CdiA tRNase toxin. The experimental structure (PDB 5j4a) is shown as a cyan cartoon, with the best homologous template in red, the best server model in green, and the best overall model in blue. There are several obvious areas of improvement over the template: modeling of the top left helix, which is not present in the template; correction of the inter-helical relationship on the top right; and correct replacement of the long template hairpin at the bottom of the structure.
Figure 7
Trend lines for average error (GDT_TS units) in identifying the best model for CASP11 (red) and CASP12 (black) targets. Lower lines indicate smaller error and thus better performance. The results of the top 10 ‘single-model’ methods (solid lines) are significantly better in CASP12 (black) than in CASP11 (red). In CASP12, the accuracy of the best single-model methods (black line) is higher than that of clustering methods (black dashed line), while in CASP11 the accuracy of single-model methods (red solid line) was much worse than the accuracy of clustering methods (red dashed line).
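One plausible reading of the "error in identifying the best model" is the GDT_TS gap between the truly best model for a target and the model an accuracy-estimation method ranks first. The sketch below illustrates that reading only; the function names and the dictionary layout are assumptions, and the assessors' exact procedure is described in the full paper rather than here.

```python
def selection_error(true_gdt_ts, estimated_quality):
    """GDT_TS lost by trusting an estimator's top-ranked model for one target.

    true_gdt_ts: dict mapping model id -> GDT_TS against experiment.
    estimated_quality: dict mapping model id -> estimator score
        (higher means the estimator believes the model is better).
    """
    # Model the estimator would select.
    picked = max(estimated_quality, key=estimated_quality.get)
    # Gap between the actual best model and the selected one.
    return max(true_gdt_ts.values()) - true_gdt_ts[picked]


def mean_selection_error(targets):
    """Average selection error over (true_gdt_ts, estimated_quality) pairs."""
    errors = [selection_error(true, est) for true, est in targets]
    return sum(errors) / len(errors)
```

An estimator that always selects the best model has an error of 0, which is why lower trend lines in the figure indicate better performance.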
