Applying and improving AlphaFold at CASP14

John Jumper¹, Richard Evans¹, Alexander Pritzel¹, Tim Green¹, Michael Figurnov¹, Olaf Ronneberger¹, Kathryn Tunyasuvunakool¹, Russ Bates¹, Augustin Žídek¹, Anna Potapenko¹, Alex Bridgland¹, Clemens Meyer¹, Simon A A Kohl¹, Andrew J Ballard¹, Andrew Cowie¹, Bernardino Romera-Paredes¹, Stanislav Nikolov¹, Rishub Jain¹, Jonas Adler¹, Trevor Back¹, Stig Petersen¹, David Reiman¹, Ellen Clancy¹, Michal Zielinski¹, Martin Steinegger²,³, Michalina Pacholska¹, Tamas Berghammer¹, David Silver¹, Oriol Vinyals¹, Andrew W Senior¹, Koray Kavukcuoglu¹, Pushmeet Kohli¹, Demis Hassabis¹

Affiliations

¹ DeepMind, London, UK.
² School of Biological Sciences, Seoul National University, Seoul, South Korea.
³ Artificial Intelligence Institute, Seoul National University, Seoul, South Korea.

PMID: 34599769
PMCID: PMC9299164
DOI: 10.1002/prot.26257

Proteins. 2021 Dec;89(12):1711-1721.

Abstract

We describe the operation and improvement of AlphaFold, the system entered by the team AlphaFold2 in the "human" category of the 14th Critical Assessment of Protein Structure Prediction (CASP14). The AlphaFold system entered in CASP14 is entirely different from the one entered in CASP13. It uses a novel end-to-end deep neural network trained to produce protein structures from amino acid sequence, multiple sequence alignments, and homologous proteins. In the assessors' ranking by summed z-scores (>2.0), AlphaFold scored 244.0 compared to 90.8 for the next best group. The predictions made by AlphaFold had a median domain GDT_TS of 92.4; this is the first time that this level of average accuracy has been achieved during CASP, especially on the more difficult Free Modeling targets, and it represents a significant improvement in the state of the art in protein structure prediction. We report how AlphaFold was run as a human team during CASP14 and how it was improved such that it now achieves an equivalent level of performance without intervention, opening the door to highly accurate large-scale structure prediction.
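For readers unfamiliar with the GDT_TS metric quoted above, the following is a minimal sketch of how the score is commonly defined: the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of Cα atoms that lie within the cutoff of their experimental positions. It is a simplification and not the assessors' implementation. It assumes a single fixed superposition has already been applied and that per-residue Cα distances are supplied directly, whereas the full metric searches over superpositions separately for each cutoff. The function name `gdt_ts` and its input format are illustrative only.

```python
# Simplified GDT_TS from precomputed C-alpha distances (assumption: the
# predicted and experimental structures are already superimposed).
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # distance cutoffs in angstroms


def gdt_ts(ca_distances):
    """Return a GDT_TS-style score in [0, 100] for one domain.

    ca_distances: per-residue C-alpha distances (angstroms) between the
    prediction and the experimental structure.
    """
    if not ca_distances:
        raise ValueError("need at least one residue distance")
    n = len(ca_distances)
    # Fraction of residues within each cutoff.
    fractions = [sum(d <= cutoff for d in ca_distances) / n
                 for cutoff in GDT_TS_CUTOFFS]
    # Average the four fractions and express as a percentage.
    return 100.0 * sum(fractions) / len(fractions)


if __name__ == "__main__":
    # Hypothetical distances for a four-residue toy example.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

Under this definition, a score above 90 means that nearly all Cα atoms fall within the tighter cutoffs, which is why the median quoted in the abstract indicates predictions close to the experimental structures.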

Keywords: AlphaFold; CASP; deep learning; machine learning; protein structure prediction.

Conflict of interest statement

John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Russ Bates, Alex Bridgland, Simon A. A. Kohl, David Reiman, and Andrew W. Senior have filed provisional patent applications relating to machine learning for predicting protein structures. The remaining authors declare no competing interests.

Figures

**FIGURE 1**
Examples of visualizations used in the prediction checking Colaboratory notebook, shown with CASP target T1101

**FIGURE 2**
T1024: (A) Per‐residue lDDT‐Cα and pLDDT of T1024. Vertical gray shading indicates residues missing in the experimental structure, and colored shading indicates minimum and maximum values over five predictions. The pLDDT shows low confidence in the linker region indicating possible flexibility and qualitatively agrees with the true per‐residue lDDT‐Cα. (B) Unrealized distances in the expected distances of T1024 indicating possible alternate relative conformations of the two domains

**FIGURE 3**
T1044: Comparison of the number of effective alignments (Neff) per residue for each MSA, derived both from domain sequences and from cropping the full sequence MSA. Four domains (T1033, T1039, T1040, and T1043) substantially benefit from using the full sequence MSA. The dashed green line shows the approximate 30 alignment threshold considered sufficient for a good prediction with AlphaFold

**FIGURE 4**
Comparison of three different prediction methods for the targets with significant interventions: “Original system” is the automated prediction system as it existed at target release. “Submitted prediction” is the submitted structure prediction. “Final system” is the automated system as it existed at the end of the CASP14 assessment, improved by experience

**FIGURE 5**
All models produced for T1064–pLDDT versus final lDDT‐Cα. The strong correlation indicates that ranking many predictions by pLDDT was a successful strategy for this target
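The ranking strategy mentioned in the Figure 5 caption can be illustrated with a short sketch: many candidate predictions are scored by their predicted confidence (pLDDT) and the highest-scoring ones are kept. This is an illustration under stated assumptions, not the procedure used by the authors. It assumes each prediction carries a list of per-residue pLDDT values on a 0 to 100 scale, that the per-prediction score is the plain mean over residues, and that five models are retained; the function name `rank_by_mean_plddt` and the model names are hypothetical.

```python
def rank_by_mean_plddt(predictions, top_k=5):
    """Return the names of the top_k predictions by mean per-residue pLDDT.

    predictions: dict mapping a (hypothetical) model name to a list of
    per-residue pLDDT values in the range 0-100.
    """
    scored = []
    for name, per_residue in predictions.items():
        if not per_residue:
            continue  # skip predictions with no confidence values
        mean_plddt = sum(per_residue) / len(per_residue)
        scored.append((mean_plddt, name))
    # Highest mean confidence first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Toy confidence values for three hypothetical models.
    candidates = {
        "model_a": [90.0, 80.0],
        "model_b": [50.0, 60.0],
        "model_c": [95.0, 95.0],
    }
    print(rank_by_mean_plddt(candidates, top_k=2))
```

Such a ranking is only useful when pLDDT tracks the true accuracy, which is the correlation the caption reports for T1064.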
