Molecules. 2023 Nov 7;28(22):7462.
doi: 10.3390/molecules28227462.

AlphaFold Blindness to Topological Barriers Affects Its Ability to Correctly Predict Proteins' Topology

Pawel Dabrowski-Tumanski et al. Molecules. 2023.

Abstract

AlphaFold is a groundbreaking deep learning tool for protein structure prediction. It achieves remarkable accuracy in modeling many 3D structures while taking as user input only the known amino acid sequence of the proteins in question. Intriguingly though, in the early steps of each individual structure prediction procedure, AlphaFold does not respect topological barriers that, in real proteins, result from the reciprocal impermeability of polypeptide chains. This study investigates how this failure to respect topological barriers affects AlphaFold predictions with respect to the topology of protein chains. We focus on classes of proteins that, during their natural folding, reproducibly form the same knot type on their linear polypeptide chain, as revealed by their crystallographic analysis. We use partially artificial test constructs in which the mutual non-permeability of polypeptide chains should not permit the formation of complex composite knots during natural protein folding. We find that, despite the formal impossibility that the protein folding process could produce such knots, AlphaFold predicts these proteins to form complex composite knots. Our study underscores the necessity for cautious interpretation and further validation of topological features in protein structures predicted by AlphaFold.

Keywords: AlphaFold; knotted proteins; overlapping residues; protein structure prediction; residue gas model; topological barriers; topology validation.
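A model that ignores the mutual impermeability of chains can place residues on top of one another. The following is a minimal sketch of one way to flag such overlapping residues, not the procedure used in the study. It assumes C-alpha coordinates are already available as (x, y, z) tuples in ångströms; the function name, the 3.0 Å cutoff, and the minimum sequence separation are illustrative assumptions.

```python
import math


def find_overlapping_residues(ca_coords, min_dist=3.0, min_seq_sep=3):
    """Return (i, j, distance) for residue pairs whose C-alpha atoms are
    closer than min_dist while being at least min_seq_sep apart in sequence.

    min_dist (in angstroms) and min_seq_sep are assumed illustrative values.
    """
    overlaps = []
    n = len(ca_coords)
    for i in range(n):
        # Skip near neighbours along the chain, which are always close.
        for j in range(i + min_seq_sep, n):
            d = math.dist(ca_coords[i], ca_coords[j])
            if d < min_dist:
                overlaps.append((i, j, d))
    return overlaps


if __name__ == "__main__":
    # Toy chain: four residues on a line, plus a fifth placed onto the first.
    toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (11.4, 0.0, 0.0), (0.5, 0.0, 0.0)]
    for i, j, d in find_overlapping_residues(toy):
        print(f"residues {i} and {j} overlap at {d:.2f} A")
```

A pairwise scan like this is quadratic in chain length, which is acceptable for single models of a few hundred residues. It detects steric overlap only and says nothing about knot type.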

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Ribosome-based co-translational folding mechanism [38]. The mechanism allows the formation of a deeply knotted protein with two “plug” domains on both termini. (A) Part of the central domain chain (green) surrounds the ribosome exit tunnel, forming the twisted loop attached to the ribosome surface (gray). The loop is threaded by the nascent C-tail going out of the ribosome. The N-tail may freely fold into the N-terminal domain, e.g., bulky plug (violet part of the chain). (B) The chain is threaded through the loop and may start the formation of the C-terminal domain (violet part of the chain). (C) Only after the chain is fully formed and pushed through the twisted loop may it form the C-terminal bulky domain (violet chain in the background). In this state, the chain is already knotted with two fast-folding bulky domains formed. After detachment of the loop from the ribosome surface, the central domain may fold into the native structure. The arrows indicate the movement of the chain out of the ribosome exit channel.
Figure 2
Proposed mechanisms of knot formation. (A) Direct threading of the tail can produce up to two knots, most probably shallow. (B) On-ribosome knotting requires attaching the loop to the ribosome surface and therefore allows the formation of a single deep knot [38]. (C) The composition of threading and on-ribosome folding allows the creation of three consecutive knots (separated by dashed lines)—one deep knot in the center surrounded by up to two, most probably shallow knots formed by the termini. The colors in panels (B,C) show large and small subunits of a ribosome. Green is the protein chain. The arrows indicate the chain movement leading to a knot.
Figure 3
Structures of MJ0366 (PDB code 2efv) multimers. (A) Trimer with 3 consecutive knots, (B) pentamer with 5 consecutive knots, (C) decamer with 10 consecutive knots. In each panel, the domains are depicted in different colors. Black strands denote the glycine linkers.
Figure 4
The structures of multimers of the deeply knotted YibK protein (PDB code 1j85). (A) Trimer with 3 consecutive trefoil knots, (B) pentamer with 5 consecutive trefoil knots. In both panels, the tandemly repeated protein blocks are represented with different pastel colors. The darker colors indicate the knotted core. The glycine linkers are presented in black.
Figure 5
Modified YibK protein with shortened loop. (A) The native (blue) structure of YibK overlaid with the structure with 40% of residues removed from the twisted loop (red). The structures differ only in the region of the modified loop. (B) The loop (green) with the threaded tail (yellow) with all the atoms explicitly shown. A schematic depiction of the threading is shown in the top left corner. The native twisted loop is delimited by green β-strands and spans residues Arg74-Phe99.
Figure 6
The predicted structures of tandem triple repeats colored by pLDDT. (A) 2efv trimer and (B) 1j85 trimer. Both structures have three consecutive trefoil knots. The lowest pLDDT (blue) can be seen in the linkers, which are flexible and do not have homologs with well-defined structures. The pLDDT of the knotted cores is relatively high (red), indicating that those regions are modeled reliably. The white parts are the tails with medium values of pLDDT. Below the structures is the scale bar.
Figure 7
Wrong structure with overlaid domains. (A) The model proposed by AlphaFold for the 1j85 trimeric repeat ranked in second place. The terminal domains (red and green) are almost overlaid. (B) The plot of pLDDT for the structure. The dashed line denotes the mean pLDDT score, equal to 74. The colors in panel (A) match those in panel (B).
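
The per-residue pLDDT profile and its mean, as plotted in Figure 7B, can be recovered from a predicted model file. The sketch below assumes the model is in PDB format with pLDDT stored in the B-factor column, which is how AlphaFold writes its PDB output. The function names are illustrative, and taking one value per C-alpha atom is an assumption about how the per-residue score is read.

```python
def per_residue_plddt(pdb_text):
    """Collect pLDDT values from the B-factor column of C-alpha ATOM records."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-based): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def mean_plddt(pdb_text):
    """Average the per-residue pLDDT values (the dashed line in a pLDDT plot)."""
    scores = per_residue_plddt(pdb_text)
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return sum(scores) / len(scores)
```

Reading a file with `open(path).read()` and passing the text to `mean_plddt` is enough for a single model. A mean near 74, as in Figure 7, can hide a structure whose individual domains score well but are placed on top of each other, so the mean alone is not a topology check.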

References

    1. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
    2. Senior A.W., Evans R., Jumper J., Kirkpatrick J., Sifre L., Green T., Qin C., Žídek A., Nelson A.W., Bridgland A., et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706–710. doi: 10.1038/s41586-019-1923-7.
    3. Senior A.W., Evans R., Jumper J., Kirkpatrick J., Sifre L., Green T., Qin C., Žídek A., Nelson A.W., Bridgland A., et al. Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins Struct. Funct. Bioinform. 2019;87:1141–1148. doi: 10.1002/prot.25834.
    4. Kryshtafovych A., Schwede T., Topf M., Fidelis K., Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins Struct. Funct. Bioinform. 2019;87:1011–1020. doi: 10.1002/prot.25823.
    5. Kryshtafovych A., Schwede T., Topf M., Fidelis K., Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins Struct. Funct. Bioinform. 2021;89:1607–1617. doi: 10.1002/prot.26237.