Proc Natl Acad Sci U S A. 2023 Aug 8;120(32):e2303499120.
doi: 10.1073/pnas.2303499120. Epub 2023 Jul 31.

The transformative power of transformers in protein structure prediction

Bernard Moussad et al. Proc Natl Acad Sci U S A.

Abstract

Transformer neural networks have revolutionized structural biology with the ability to predict protein structures at unprecedentedly high accuracy. Here, we report the predictive modeling performance of state-of-the-art protein structure prediction methods built on transformers for 69 protein targets from the recently concluded 15th Critical Assessment of Structure Prediction (CASP15) challenge. Our study shows the power of transformers in protein structure modeling and highlights future areas of improvement.

Keywords: deep learning; neural networks; protein structure prediction; transformers.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Performance benchmarking of state-of-the-art protein structure prediction methods on the CASP15 dataset. (A) GDT-TS score comparisons. The dashed lines represent the mean performance; the percentages reported in the Top Left indicate how often the method on the y axis outperforms the method on the x axis, and vice versa for the Bottom Right. (B) Grishin plot analysis. (C) Domain-level GDT-TS scores against the length of the domains, with the inset showing the domains shorter than 750 residues. The lines represent linear fits to the data. (D) Three representative CASP15 targets with the predictions (in rainbow) superimposed on the experimental structures (in gray). Bold numbers indicate the best performance. (E) Correct overall topology prediction performance in terms of the percentage of targets with TM-score > 0.5. (F) lDDT scores of MSA-based methods against the MSA depth measured by the logarithm of Neff. The lines represent linear fits to the data. (G) lDDT scores against the length of the protein targets, with the inset showing the targets shorter than 750 residues. The lines represent linear fits to the data. (H) Ramachandran plot analysis. The color code ramps from blue to red for low to high density. (I) MolProbity score distributions. (J) GDC-SC score comparisons similar to A.
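As a rough illustration of the quantities in panel A, the sketch below computes a simplified GDT-TS and the head-to-head percentages between two methods. It is not the authors' evaluation pipeline. It assumes that per-residue C-alpha deviations (in angstroms) are already available from a single fixed superposition, whereas the official GDT-TS searches over many superpositions to maximize each fraction. The function names `gdt_ts` and `win_percentages`, the example numbers, and the choice to count ties for neither method are all hypothetical.

```python
from typing import Sequence, Tuple

# Distance thresholds (in angstroms) averaged by GDT-TS.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances: Sequence[float]) -> float:
    """Simplified GDT-TS on a 0-100 scale.

    `distances` holds per-residue C-alpha deviations between model and
    experimental structure after ONE fixed superposition (an assumption;
    the official score optimizes the superposition per cutoff).
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean of the four.
    fractions = [sum(d <= c for d in distances) / n for c in GDT_TS_CUTOFFS]
    return 100.0 * sum(fractions) / len(GDT_TS_CUTOFFS)


def win_percentages(
    scores_x: Sequence[float], scores_y: Sequence[float]
) -> Tuple[float, float]:
    """Return (% of targets where y beats x, % of targets where x beats y).

    Ties count for neither method (an assumption of this sketch).
    """
    if len(scores_x) != len(scores_y) or not scores_x:
        raise ValueError("score lists must be non-empty and equally long")
    n = len(scores_x)
    y_wins = sum(y > x for x, y in zip(scores_x, scores_y))
    x_wins = sum(x > y for x, y in zip(scores_x, scores_y))
    return 100.0 * y_wins / n, 100.0 * x_wins / n


if __name__ == "__main__":
    # Made-up deviations for a four-residue toy model.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
    # Made-up per-target scores for two hypothetical methods.
    print(win_percentages([50, 60, 70, 80], [55, 60, 65, 90]))
```

With per-target scores for two methods, the first value returned by `win_percentages` corresponds to the Top Left percentage in a panel A style scatter plot and the second to the Bottom Right.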
