Commun Biol. 2023 Apr 15;6(1):421.
doi: 10.1038/s42003-023-04773-7.

Does AlphaFold2 model proteins' intracellular conformations? An experimental test using cross-linking mass spectrometry of endogenous ciliary proteins


Caitlyn L McCafferty et al. Commun Biol. 2023.

Abstract

A major goal in structural biology is to understand protein assemblies in their biologically relevant states. Here, we investigate whether AlphaFold2 structure predictions match native protein conformations. We chemically cross-linked proteins in situ within intact Tetrahymena thermophila cilia and native ciliary extracts, identifying 1,225 intramolecular cross-links within the 100 best-sampled proteins and thereby providing a benchmark of distance restraints obeyed by proteins in their native assemblies. The corresponding structure predictions were highly concordant, positioning 86.2% of cross-linked residues within Cα-to-Cα distances of 30 Å, consistent with the cross-linker length. Of these proteins, 43% showed no violations. Most inconsistencies occurred in low-confidence regions or between domains. Overall, AlphaFold2 predictions with lower predicted aligned error corresponded more closely to native structures. However, we observe cases where rigid-body domains are oriented incorrectly, as for ciliary protein BBC118, suggesting that combining structure prediction with experimental information will better reveal biologically relevant conformations.
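The distance test described in the abstract can be illustrated with a minimal sketch. It assumes that a cross-link counts as satisfied when the Cα-to-Cα distance is at most 30 Å (the abstract gives the cutoff but not whether it is inclusive). The function names, the dictionary layout, and the toy coordinates are hypothetical and are not taken from the authors' Zenodo notebook.

```python
import math

# Cutoff stated in the abstract; treating it as inclusive is an assumption.
MAX_CA_DISTANCE = 30.0  # Å


def classify_crosslinks(ca_coords, crosslinks, cutoff=MAX_CA_DISTANCE):
    """Split cross-links into satisfied and violated lists.

    ca_coords:  dict mapping residue number -> (x, y, z) of its Cα atom, in Å.
    crosslinks: iterable of (residue_i, residue_j) pairs.
    Returns two lists of (residue_i, residue_j, distance) tuples.
    """
    satisfied, violated = [], []
    for i, j in crosslinks:
        distance = math.dist(ca_coords[i], ca_coords[j])
        target = satisfied if distance <= cutoff else violated
        target.append((i, j, distance))
    return satisfied, violated


# Toy coordinates (hypothetical, not from the paper's data).
coords = {10: (0.0, 0.0, 0.0), 25: (12.0, 5.0, 0.0), 90: (40.0, 0.0, 0.0)}
links = [(10, 25), (10, 90), (25, 90)]

ok, bad = classify_crosslinks(coords, links)
fraction_ok = len(ok) / len(links)
print(f"{len(ok)} satisfied, {len(bad)} violated, fraction satisfied = {fraction_ok:.3f}")
```

In practice the coordinates would be read from the predicted model file, and the satisfied fraction computed per protein or over the whole dataset.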


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Chemical cross-linking of isolated ciliary proteins provides abundant intramolecular cross-links.
a Schematic of the protocol used to determine chemical cross-links among Tetrahymena thermophila ciliary proteins, from cell culture through ciliary isolation, incubation with the membrane-permeable cross-linker DSSO, to the use of tandem (MS1/MS2/MS3) mass spectrometry to identify the specific cross-linked peptides. Created with BioRender.com. b Examples of the most extensively intramolecularly cross-linked proteins observed. The corresponding Uniprot identifiers and amino acid sequences are provided for all proteins discussed in the supporting Zenodo archive, along with the precise locations of the cross-links.
Fig. 2. In situ cross-links agree with the known T. thermophila outer dynein arm cryo-EM structure.
a Cross-link diagram for DYH3 shows the abundance of intramolecular cross-links within the protein. b We observed a total of 155 intramolecular cross-links across all three dynein heavy chain proteins, 124 of which corresponded to structured regions and hence could be used as a validation set. Intramolecular distances are plotted for these 124 cross-links. c Intramolecular cross-links mapped onto the DYH3 structure. Overall, there was 97% agreement between the cross-links and the cryo-EM structures of the dynein proteins. d The in situ assembly of outer dynein arms (ODAs) shows that apparent cross-link violations in the monomer are actually satisfied between copies of dynein proteins, improving the cross-link agreement to 99% (PDB ID: 7MOQ) (see also Supplementary Table S1 for specific values).
Fig. 3. A general trend for fewer cross-link violations in AlphaFold2 models with higher pLDDT scores.
a Number of cross-link violations plotted against the pLDDT score for each of the T. thermophila proteins predicted. The size and shade of each dot represent the number of intramolecular cross-links for a given protein. The full data are provided as Supplementary Table S2. b A distance distribution view of the 43 proteins with no cross-link violations. c A selection of proteins from (b) with cross-links mapped onto the AF2-predicted structure.
Fig. 4. Cross-link violations tend to occur between or outside of structurally well determined regions.
a The predicted aligned error (PAE) for T. thermophila protein BBC118 (Uniprot identifier I7ME23) with satisfied and violated cross-links plotted onto the heatmap. Blue circles mark satisfied cross-links and red x’s mark cross-link violations. b We apply a watershed model to the PAE heatmap to segment the protein into individual rigid bodies. For BBC118, all cross-link violations occur between segmented rigid bodies. c The protein was split into the rigid bodies defined by the PAE segmentation, and these were modeled using the intramolecular cross-links as distance restraints to find an arrangement that satisfied all but one of the cross-links. All models and PAE plots are provided at the supporting Zenodo site.
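The idea of segmenting a PAE matrix into rigid bodies and then asking whether a cross-link spans two bodies can be sketched as follows. This is a deliberately simplified stand-in and not the watershed model named in the caption: it joins residues by single linkage whenever their mutual PAE falls below a threshold. The threshold of 5 Å, the function names, and the toy matrix are all assumptions made for illustration.

```python
def segment_rigid_bodies(pae, threshold=5.0):
    """Assign each residue index a rigid-body label.

    Simplified stand-in for watershed segmentation: residues i and j are
    merged when both pae[i][j] and pae[j][i] are below `threshold`
    (hypothetical value), using a union-find structure.
    Labels are numbered in order of first appearance along the sequence.
    """
    n = len(pae)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if max(pae[i][j], pae[j][i]) < threshold:
                parent[find(i)] = find(j)

    labels_by_root = {}
    return [labels_by_root.setdefault(find(i), len(labels_by_root)) for i in range(n)]


def interbody_crosslinks(labels, crosslinks):
    """Return the cross-links whose two residues lie in different bodies."""
    return [(i, j) for i, j in crosslinks if labels[i] != labels[j]]


# Toy 6-residue PAE matrix (hypothetical): two blocks of three residues,
# low error within a block and high error between blocks.
toy_pae = [[2.0 if (i < 3) == (j < 3) else 20.0 for j in range(6)] for i in range(6)]

labels = segment_rigid_bodies(toy_pae)
inter = interbody_crosslinks(labels, [(0, 2), (1, 4)])
print(labels, inter)
```

Residue indices here are zero-based positions in the matrix; a real analysis would map them to sequence numbering and would likely need a more robust segmentation than single linkage.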
Fig. 5. The proportion of cross-link violations is well-predicted by AF2’s Predicted Aligned Error score, suggesting that it accurately captures the accuracy of structural models.
Considering the full set of cross-links in the 100 proteins, we ranked all cross-linked amino acid pairs by increasing PAE scores and divided them into 49 bins, comprising 25 cross-links per bin. For each bin of PAE values, we plotted the mean PAE score (+/− 1 standard deviation) and the proportion of in situ cross-links violated within that bin (in the unrelaxed AF2-predicted structures). All relevant data are located in the Zenodo repository, accompanied by a Python notebook to compute raw and average distances.
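The binning described in this caption (1,225 cross-linked pairs ranked by PAE and split into 49 bins of 25) could be reproduced along the following lines. This is a sketch under stated assumptions, not the authors' notebook: the input format, the function name, and the use of the population standard deviation are hypothetical choices, and the data below are synthetic.

```python
from statistics import mean, pstdev


def bin_by_pae(pairs, bin_size=25):
    """Rank cross-linked pairs by PAE and summarize consecutive bins.

    pairs: list of (pae_value, is_violated) tuples, one per cross-link.
    Returns one dict per bin with the mean PAE, its standard deviation
    (population SD here; the caption does not say which was used), and
    the fraction of cross-links violated in that bin.
    """
    ranked = sorted(pairs, key=lambda pair: pair[0])
    summaries = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        pae_values = [pae for pae, _ in chunk]
        summaries.append({
            "mean_pae": mean(pae_values),
            "sd_pae": pstdev(pae_values),
            "fraction_violated": sum(1 for _, violated in chunk if violated) / len(chunk),
        })
    return summaries


# Synthetic data (not from the paper): 1,225 pairs with increasing PAE,
# where only the 25 highest-PAE pairs are marked as violated.
synthetic = [(0.02 * i, i >= 1200) for i in range(1225)]

bins = bin_by_pae(synthetic)
print(len(bins), bins[0]["fraction_violated"], bins[-1]["fraction_violated"])
```

Plotting `mean_pae` against `fraction_violated`, with `sd_pae` as horizontal error bars, would give a figure of the kind the caption describes.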
Fig. 6. Predictions for the protein eEF-2 show that the AF2 model differs from 4 homologous crystal structures and the cross-links due to interdomain rearrangements.
a Distribution of cross-link distances for proteins in our dataset with four or more cross-link violations (data are available on the supporting Zenodo site). b A hinge-like motion is evident between the two domains of the AF2 structure of the T. thermophila eEF-2 protein (Uniprot accession Q22DR0) (cyan) compared to four eEF-2 structures solved by X-ray crystallography in the presence of different binding partners. All structures were superimposed on the N-terminal GTP binding domain. c Distribution of the 35 cross-link distances in each structure, with the appropriate PDB identifiers labeled to the right of the violin plots.


