Proteins. 2019 Dec;87(12):1011-1020.
doi: 10.1002/prot.25823. Epub 2019 Oct 23.

Critical assessment of methods of protein structure prediction (CASP)-Round XIII

Andriy Kryshtafovych et al. Proteins. 2019 Dec.

Abstract

CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13, held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically "ab initio" modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: with the proviso that there is an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.

Keywords: CASP; community wide experiment; protein structure prediction.

Figures

Figure 1:
Trend lines of backbone accuracy for the best models in each of the 13 CASP experiments. Individual target points are shown for the two most recent experiments. The accuracy metric, GDT_TS, is a multiscale indicator of the closeness of the Cα atoms in a model to those in the corresponding experimental structure. Target difficulty is based on sequence and structure similarity to other proteins with known experimental structures. There is a striking improvement in model accuracy in CASP13 (top black line), particularly for the more difficult targets.
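As a rough illustration of what "multiscale" means here, the sketch below computes a simplified GDT_TS-style score. It assumes the model has already been superimposed on the experimental structure once and that per-residue Cα distances (in Angstroms) are available. The function name `gdt_ts` and the input format are hypothetical. The 1, 2, 4 and 8 Angstrom cutoffs are the ones conventionally used for GDT_TS, but a full GDT calculation searches for the best superposition at each cutoff, which this sketch does not attempt.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score on a 0-100 scale.

    distances: per-residue Calpha deviations (Angstroms) between model and
    experimental structure after ONE fixed superposition (an assumption of
    this sketch; the real metric optimizes the superposition per cutoff).
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    # Average the fractions over the cutoffs and scale to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: five residues with increasing error.
print(gdt_ts([0.5, 1.5, 3.0, 6.0, 12.0]))  # roughly 50
```

Averaging over several cutoffs is what lets the score reward both a roughly correct fold (8 Angstrom cutoff) and near-atomic backbone accuracy (1 Angstrom cutoff).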
Figure 2:
Best contact prediction precision in recent CASPs. CASPs 9 and 10 continued a long trend of low precision. CASP11 shows a small advance, while the two most recent, CASP12 and 13, show dramatic improvements. In CASPs 11 and 12 progress is the result of more sophisticated statistical models, together with largely conventional machine learning. The further jump in CASP13 is the result of the effective deployment of deep learning methods. (Average fraction of correctly predicted contacts for the most confidently predicted L/5 contacts 24 or more residues apart in the sequence, where L is target length. Free modeling targets, average for the best performing group in each CASP. Contacting residue pairs defined as those with less than 8 Angstroms between Cβ atoms).
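The precision measure defined in the caption above can be sketched as follows. This is a minimal illustration, not the assessors' code: the function name, the 0-based `(i, j, confidence)` prediction format, and the coordinate list are assumptions made for the example. The sequence separation (24 or more), the L/5 list size, and the 8 Angstrom Cβ cutoff come from the caption.

```python
import math


def top_l5_long_range_precision(predictions, cb_coords,
                                min_separation=24, contact_cutoff=8.0):
    """Precision of the most confident L/5 long-range contact predictions.

    predictions: iterable of (i, j, confidence) with 0-based residue indices
                 (hypothetical input format).
    cb_coords:   one (x, y, z) Cbeta coordinate per residue, taken from the
                 experimental structure (a substitute atom would be needed
                 for glycine, which has no Cbeta).
    """
    length = len(cb_coords)
    n_top = max(1, length // 5)
    # Keep only pairs at least `min_separation` residues apart in sequence.
    long_range = [p for p in predictions if abs(p[0] - p[1]) >= min_separation]
    # Rank by confidence and keep the top L/5.
    ranked = sorted(long_range, key=lambda p: p[2], reverse=True)[:n_top]
    if not ranked:
        return 0.0
    # A predicted pair is correct if the Cbeta atoms are closer than the cutoff.
    correct = sum(1 for i, j, _ in ranked
                  if math.dist(cb_coords[i], cb_coords[j]) < contact_cutoff)
    return correct / len(ranked)


# Toy 50-residue hairpin: residue k pairs with residue 49 - k at 5 Angstroms.
coords = ([(3.8 * k, 0.0, 0.0) for k in range(25)] +
          [(3.8 * (49 - k), 5.0, 0.0) for k in range(25, 50)])
preds = [(0, 49, 0.9), (10, 39, 0.8), (0, 30, 0.7), (12, 37, 0.6),
         (20, 29, 0.95)]  # last pair is too close in sequence and is ignored
print(top_l5_long_range_precision(preds, coords))  # 0.75
```

Dividing by the number of ranked predictions rather than by L/5 is a choice made for this sketch; it only matters when fewer than L/5 long-range predictions are supplied.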
Figure 3:
Contact prediction precision trend lines as a function of sequence alignment depth and target length. In CASP13 there is a reduced dependency on alignment depth, resulting in more accurate results for shallow alignments as well as higher precision overall. Strikingly, for ten out of the 31 free modeling targets, the best predictions achieved 100% precision for this subset of contacts (see Figure 2 for definitions). The effective alignment depth, Neff, includes metagenomic sequences. Neff was calculated using a 90% sequence identity cutoff and a minimum of 60% sequence coverage.
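The caption gives only the two thresholds used for Neff, not the formula. The sketch below shows one common way an effective alignment depth is computed, under loudly stated assumptions: sequences covering less than 60% of the query are discarded, and each remaining sequence is down-weighted by the number of sequences at least 90% identical to it. The function names, the gap character `-`, and the exact identity and coverage definitions are hypothetical and may differ from the procedure actually used for CASP13.

```python
def coverage(query, seq):
    """Fraction of non-gap query columns that are also non-gap in seq."""
    cols = [k for k, c in enumerate(query) if c != "-"]
    return sum(1 for k in cols if seq[k] != "-") / len(cols)


def identity(a, b):
    """Fraction of identical residues over columns where both are non-gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(1 for x, y in pairs if x == y) / len(pairs)


def effective_depth(msa, identity_cutoff=0.9, min_coverage=0.6):
    """Assumed Neff: sum of 1 / (size of each sequence's 90%-identity cluster).

    msa: list of equal-length aligned sequences; msa[0] is the query.
    """
    query = msa[0]
    kept = [s for s in msa if coverage(query, s) >= min_coverage]
    neff = 0.0
    for s in kept:
        # Every kept sequence counts itself, so neighbours is at least 1.
        neighbours = sum(1 for t in kept if identity(s, t) >= identity_cutoff)
        neff += 1.0 / neighbours
    return neff


query = "ACDEFGHIKLMNPQRSTVWY"
msa = [
    query,
    query,                       # exact duplicate of the query
    "ACDEFGHIKLMNPQRSTVWF",      # 95% identical to the query
    query[::-1],                 # unrelated sequence (reversed query)
    "ACDEF" + "-" * 15,          # 25% coverage, filtered out
]
print(effective_depth(msa))  # about 2: one redundant cluster plus one outlier
```

Under this weighting, many near-identical sequences count as roughly one, which is why Neff is a better guide to usable evolutionary information than the raw number of aligned sequences.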
Figure 4:
Crystal structure of a 354-residue domain of a free modeling target (T0969-D1), ESKIMO 1, a probable xylan acetyltransferase, PDB 6CCI (left panel) and the most accurate CASP model (right panel). Most of the structure core is modeled to a Cα accuracy of better than 1 (cyan) or 2 Angstroms (green). Irregular loop regions are less accurate (yellow, better than 4 Angstroms; orange, up to 8 Angstroms error). Some residues (red) in external loops have larger errors.
Figure 5:
Best model main chain accuracy (GDT_TS) as a function of sequence alignment depth and target length for CASPs 12 and 13. Accuracy depends on alignment depth, as is expected if the result is dominated by contact prediction accuracy and related advances. Across all alignment depths, CASP13 models are on average more accurate than those in CASP12.
Figure 6:
Best model backbone accuracy (GDT_TS) as a function of target difficulty for template-based models in recent CASPs. CASP13 shows a marked improvement in accuracy compared to previous CASPs. Targets are those where there is a clear sequence relationship to a known structure (termed TBM) and those with a marginal relationship (TBM/FM).
Figure 7:
Trend lines for the fraction of non-principal template (‘loop’) residues correctly modeled. There is a substantial improvement in CASP13. (Best models received for each target, 3.8 Angstrom Cα atom agreement or better considered correct, TBM and TBM/FM targets).
Figure 8:
Part of the experimental structure of target H0953 (PDB 6F45), the adhesin tip complex of a bacteriophage tail fiber, illustrating subunit structure interdependence. One of the two protein chains contributing to this assembly forms a trimer (colored red, green and blue), with the N-terminal five-strand beta sheets of the three monomers packing against each other. The C-terminal three beta strands of each monomer interdigitate with each other. The C-terminal strands also form an interface with the helical end of another subunit (green). Impressively, in spite of the apparent interdependency of the five-strand beta sheets, accurate models were returned for that part of the structure. But failure to consider the even more intimate subunit interactions of the three N terminal strands resulted in incorrect models for that subdomain.

References

    1. Kinch LN, Kryshtafovych A, Monastyrskyy B, Grishin NV. CASP13 target classification into tertiary structure prediction categories. Proteins 2019.
    2. Egelman EH. The Current Revolution in Cryo-EM. Biophys J 2016;110(5):1008–1012.
    3. Kryshtafovych A, et al. Cryo-EM targets in CASP13: overview and evaluation of results. [CASP13 special issue] 2019.
    4. Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP)-Round XII. Proteins 2018;86 Suppl 1:7–15.
    5. Kryshtafovych A, Fidelis K, Moult J. CASP10 results compared to those of previous CASP experiments. Proteins 2014;82 Suppl 2:164–174.
