Protein Sci. 2022 Jun;31(6):e4353.
doi: 10.1002/pro.4353.

AlphaFold2 fails to predict protein fold switching

Devlina Chakravarty et al. Protein Sci. 2022 Jun.

Abstract

AlphaFold2 has revolutionized protein structure prediction by leveraging sequence information to rapidly model protein folds with atomic-level accuracy. Nevertheless, previous work has shown that these predictions tend to be inaccurate for structurally heterogeneous proteins. To systematically assess factors that contribute to this inaccuracy, we tested AlphaFold2's performance on 98 fold-switching proteins, which assume at least two distinct-yet-stable secondary and tertiary structures. Topological similarities were quantified between five predicted and two experimentally determined structures of each fold-switching protein. Overall, 94% of AlphaFold2 predictions captured one experimentally determined conformation but not the other. Despite these biased results, AlphaFold2's estimated confidences were moderate-to-high for 74% of fold-switching residues, a result that contrasts with overall low confidences for intrinsically disordered proteins, which are also structurally heterogeneous. To investigate factors contributing to this disparity, we quantified sequence variation within the multiple sequence alignments used to generate AlphaFold2's predictions of fold-switching and intrinsically disordered proteins. Unlike intrinsically disordered regions, whose sequence alignments show low conservation, fold-switching regions had conservation rates statistically similar to canonical single-fold proteins. Furthermore, intrinsically disordered regions had systematically lower prediction confidences than either fold-switching or single-fold proteins, regardless of sequence conservation. AlphaFold2's high prediction confidences for fold switchers indicate that it uses sophisticated pattern recognition to search for one most probable conformer rather than protein biophysics to model a protein's structural ensemble. Thus, it is not surprising that its predictions often fail for proteins whose properties are not fully apparent from solved protein structures. Our results emphasize the need to look at protein structure as an ensemble and suggest that systematic examination of fold-switching sequences may reveal propensities for multiple stable secondary and tertiary structures.

Keywords: AlphaFold2; fold-switching; protein-folding; structural heterogeneity.

Figures

FIGURE 1
AlphaFold2 fails to predict fold switching. 94% of AlphaFold2 predictions fall below the identity line (dashed line) in (a), indicating bias toward one fold. All five models for each test case are shown. Predictions were clustered by the quality of their correspondence with experiment (good predictions, with TM ≥0.8 for both conformations, are colored teal; moderate, purple; poor, green). Furthermore, all five of AlphaFold2's best models corresponded to Fold1 for 81% of fold‐switching sequences. Lines labeled with Roman numerals identify the points on the graph whose experimentally determined structures are depicted in (b). Experimentally determined structures representing all three clusters are shown in (c) and compared with the top five AlphaFold2 predictions for their sequences. Experimentally determined fold‐switching regions are colored according to their cluster; predicted fold‐switching regions are black; single‐folding regions are gray. The PDB IDs, chains, lengths, TM‐scores, and RMSDs are as follows: (I) 4aanA/4aalA: 341, 0.9/0.83, 0.5/5.2 Å; (II) 5jyt_A/2qke_E: 106/108, 0.7/0.4, 7.6/9.3 Å; (III) 6z4uA/7kdtB: 97, 0.66/0.2, 5.0/14.3 Å.
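The comparison in panel (a) can be illustrated with a minimal sketch. It assumes that each prediction is summarized by a pair of TM-scores (one against Fold1, one against Fold2) and that "below the identity line" means the Fold2 score is lower than the Fold1 score. The axis assignment, the function names, and the use of the three caption examples as input are assumptions made for illustration; only the 0.8 cutoff for "good" predictions comes from the caption, and the cutoffs separating moderate from poor predictions are not given there, so they are not modeled.

```python
from typing import List, Tuple

GOOD_TM = 0.8  # cutoff for a "good" prediction, as stated in the caption


def below_identity(tm_fold1: float, tm_fold2: float) -> bool:
    """True if the prediction matches Fold1 better than Fold2 (assumed axes)."""
    return tm_fold2 < tm_fold1


def is_good(tm_fold1: float, tm_fold2: float, cutoff: float = GOOD_TM) -> bool:
    """True if the prediction reaches the cutoff for both conformations."""
    return tm_fold1 >= cutoff and tm_fold2 >= cutoff


def fraction_below_identity(pairs: List[Tuple[float, float]]) -> float:
    """Fraction of (TM vs Fold1, TM vs Fold2) pairs below the identity line."""
    if not pairs:
        raise ValueError("need at least one TM-score pair")
    return sum(below_identity(a, b) for a, b in pairs) / len(pairs)


# TM-score pairs for examples (I), (II), and (III) listed in the caption.
caption_examples = [(0.9, 0.83), (0.7, 0.4), (0.66, 0.2)]

if __name__ == "__main__":
    print(fraction_below_identity(caption_examples))
    print([is_good(a, b) for a, b in caption_examples])
```

Under these assumptions, only example (I) counts as good, and all three examples fall on the Fold1 side of the identity line.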
FIGURE 2
Distributions of AlphaFold2 prediction confidences, measured by per‐residue predicted local distance difference test (pLDDT) scores, differ between fold‐switching (blue), single‐fold (gray), and intrinsically disordered (red) protein sequences. Lower pLDDT scores indicate lower prediction confidences. Thus, AlphaFold2 is generally less confident in its predictions of IDPs than of fold‐switching or single‐folding proteins.
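For readers who want to inspect such scores themselves: AlphaFold2 writes per-residue pLDDT values into the B-factor column of its PDB-format output. The sketch below collects one value per residue by reading that column from the Cα `ATOM` records. The function names are hypothetical, the parser handles only fixed-column PDB text (not mmCIF), and this is not presented as the authors' pipeline.

```python
from typing import List


def ca_plddt(pdb_text: str) -> List[float]:
    """Return the B-factor (pLDDT in AlphaFold2 models) of each CA atom."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16,
        # temperature factor in columns 61-66 (1-indexed).
        if (line.startswith("ATOM") and len(line) >= 66
                and line[12:16].strip() == "CA"):
            scores.append(float(line[60:66]))
    return scores


def fraction_at_or_above(scores: List[float], cutoff: float = 70.0) -> float:
    """Fraction of residues whose pLDDT is at or above the cutoff."""
    if not scores:
        raise ValueError("no scores given")
    return sum(s >= cutoff for s in scores) / len(scores)
```

A typical use would be `fraction_at_or_above(ca_plddt(open(path).read()))`, where `path` points to a model file; the default cutoff of 70 mirrors the threshold used in Figure 4.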
FIGURE 3
Evolutionary rates, as indicated by Rate4Site grades (1 = rapid evolution; 9 = high conservation), differ between fold‐switching (blue), single‐fold (gray), and intrinsically disordered (red) protein sequences. Sequences of fold‐switching and single‐fold proteins tend to be more conserved than IDP sequences.
FIGURE 4
The fraction of AlphaFold2 predictions with per‐residue predicted local distance difference test (pLDDT) scores ≥70 increases as sequence conservation increases (a). Distributions of prediction confidences (quantified by pLDDT scores) are skewed lower for disordered proteins (red) than for single‐fold (gray) and fold‐switching proteins (blue) (b). Wider regions correspond to more populated prediction confidences. In both cases, conservation score was determined using Rate4Site; higher scores correspond to more conserved sequences. Gray/white backgrounds group protein regions with the same conservation score.
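The quantity plotted in panel (a) can be sketched as a simple grouped fraction. The example below assumes each residue is represented by a pair of (Rate4Site grade from 1 to 9, pLDDT score); the function name and the toy input are hypothetical, and the sketch illustrates the calculation rather than reproducing the authors' analysis.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

PLDDT_CUTOFF = 70.0  # threshold named in the caption


def confident_fraction_by_grade(
    residues: Iterable[Tuple[int, float]],
    cutoff: float = PLDDT_CUTOFF,
) -> Dict[int, float]:
    """Map each conservation grade to the fraction of residues with pLDDT >= cutoff."""
    totals: Dict[int, int] = defaultdict(int)
    confident: Dict[int, int] = defaultdict(int)
    for grade, plddt in residues:
        if not 1 <= grade <= 9:
            raise ValueError(f"grade {grade} is outside the 1-9 range")
        totals[grade] += 1
        if plddt >= cutoff:
            confident[grade] += 1
    return {g: confident[g] / totals[g] for g in sorted(totals)}


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    toy = [(1, 40.0), (1, 75.0), (9, 90.0), (9, 85.0)]
    print(confident_fraction_by_grade(toy))
```

Computing this separately for fold-switching, single-fold, and disordered residues would give one curve per protein class, which is the form of comparison the caption describes.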
