Proteins. 2021 Dec;89(12):1607-1617.
doi: 10.1002/prot.26237. Epub 2021 Oct 7.

Critical assessment of methods of protein structure prediction (CASP)-Round XIV

Andriy Kryshtafovych et al. Proteins. 2021 Dec.

Abstract

Critical assessment of structure prediction (CASP) is a community experiment to advance methods of computing three-dimensional protein structure from amino acid sequence. Core components are rigorous blind testing of methods and evaluation of the results by independent assessors. In the most recent experiment (CASP14), deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. In this sense, the results represent a solution to the classical protein-folding problem, at least for single proteins. The models have already been shown to be capable of providing solutions for problematic crystal structures, and there are broad implications for the rest of structural biology. Other research groups also substantially improved performance. Here, we describe these results and outline some of the many implications. Other related areas of CASP, including modeling of protein complexes, structure refinement, estimation of model accuracy, and prediction of inter-residue contacts and distances, are also described.

Keywords: Alphafold; CASP; community wide experiment; protein folding; protein structure prediction.

Conflict of interest statement

The authors declare they have no conflicts of interest.

Figures

Figure 1:
Trend lines of backbone agreement with experiment for the best models in each of the 14 CASP rounds. Individual target points are shown for the most recent round. The three targets with the lowest agreement with experiment are colored blue (T1027 and T1029, NMR) and red (T1047s1, a subunit of a cryo-EM-derived heteromeric structure with complex inter-subunit interactions). The agreement metric, GDT_TS, is a multi-scale indicator of the closeness of the Cα atoms in a model to those in the corresponding experimental structure. Target difficulty is based on sequence and structure similarity to other proteins with known experimental structures. Performance in CASP14 (top black line) is very impressive, with accuracy approaching and in some cases likely exceeding experimental accuracy for many targets (see later text).
Figure 2:
Example of a high-accuracy CASP14 model: CASP target T1053, a two-domain bacterial kinase. Model (from AlphaFold2, GDT_TS 93) in magenta, experimental structure (PDB 7m7a, resolution 3.2 Å) in turquoise. Both domains are difficult modeling targets (FM/TBM category).
Figure 3:
Superposition of a model (from AlphaFold2) of SARS CoV-2 ORF8 (CASP target T1064) and the corresponding experimental structure (PDB 7jtl, resolution 2.0 Å), illustrating the atomic level of agreement with experiment typically found in CASP14.
Figure 4:
(A) Average agreement of the best CASP14 models with experiment (GDT_TS) for different categories of experimental data. The first three bins show a fall-off as the resolution of X-ray structures decreases, suggesting lower GDT_TS values are partly due to higher experimental error. The effect is most pronounced for cryo-EM experimental structures (right-hand bin, resolution range 2.2–3.8 Å). Two of the three NMR targets (not included here) have very low GDT_TS values (see text). (B) Distribution of experimental data type across categories of target difficulty. The large majority of targets in the most difficult category (FM) have low-resolution X-ray, cryo-EM (resolution range 2.2–3.8 Å), or NMR data, whereas in the easiest category, 90% of targets are determined from higher-resolution X-ray data.
Figure 5:
(A) Backbone agreement with experiment (GDT_TS) versus fraction of targets reaching a given level of agreement with experiment in different modeling difficulty categories. Trend lines for targets with the strongest homologous structural information available (‘TBM-easy’) are green, those where homology modeling is more difficult (‘TBM-Hard’) blue, those with only remote structural homologies (‘FM/TBM’) red, and the most difficult set with no detectable homology to known structures (‘FM’) black. Only the best model for each target is shown. Models for targets with more information on homologous structures tend to be more accurate, but interpretation of that trend is complicated (see text). (B) Backbone agreement with experiment (Cα atom Root Mean Square Deviation) for different modeling difficulty categories (best models for each target).
Figure 6:
Best model backbone agreement with experiment (GDT_TS) as a function of log normalized sequence alignment depth (Neff/len) for targets with no detectable homology to known structures (‘Free Modeling’ (‘FM’)). Data for the most recent three CASPs. For this subset of the hardest ‘FM’ targets, dependence on alignment depth seen in earlier CASPs is not seen in CASP14.
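
The two backbone agreement measures named in the captions above, GDT_TS and Cα RMSD, can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of this record: the model and experimental Cα coordinates are already matched residue by residue and already superimposed, and the usual GDT_TS distance cutoffs of 1, 2, 4, and 8 Å are used. The real GDT_TS calculation searches over many superpositions to maximize the count at each cutoff, so this fixed-superposition version can only give a lower bound. The function names are hypothetical.

```python
import math


def ca_distances(model, reference):
    """Per-residue distances (in Å) between matched Cα coordinates."""
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    return [math.dist(m, r) for m, r in zip(model, reference)]


def gdt_ts_fixed_superposition(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of Cα atoms within each cutoff.

    Assumption: the coordinates are already superimposed. The full GDT_TS
    procedure optimizes the superposition separately for each cutoff.
    """
    distances = ca_distances(model, reference)
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def ca_rmsd(model, reference):
    """Root mean square deviation over matched Cα atoms, same superposition."""
    distances = ca_distances(model, reference)
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Toy data (invented for illustration): four residues, with the model
# displaced from the reference by 0.5, 1.5, 3.0, and 10.0 Å.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts_fixed_superposition(model, reference)  # 56.25
rmsd = ca_rmsd(model, reference)
```

In the toy data a single badly placed residue dominates the RMSD, while GDT_TS still credits the three residues that lie within 4 Å. This is one reason a multi-scale measure is useful when comparing models of very different quality.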

