PLoS Comput Biol. 2022 Aug 22;18(8):e1010483.
doi: 10.1371/journal.pcbi.1010483. eCollection 2022 Aug.

SPEACH_AF: Sampling protein ensembles and conformational heterogeneity with Alphafold2

Richard A Stein et al. PLoS Comput Biol. 2022.

Abstract

The unprecedented performance of Deepmind's Alphafold2 in predicting protein structure in CASP XIV and the creation of a database of structures for multiple proteomes and protein sequence repositories is reshaping structural biology. However, because this database returns a single structure, it brought into question Alphafold's ability to capture the intrinsic conformational flexibility of proteins. Here we present a general approach to drive Alphafold2 to model alternate protein conformations through simple manipulation of the multiple sequence alignment via in silico mutagenesis. The approach is grounded in the hypothesis that the multiple sequence alignment must also encode for protein structural heterogeneity, thus its rational manipulation will enable Alphafold2 to sample alternate conformations. A systematic modeling pipeline is benchmarked against canonical examples of protein conformational flexibility and applied to interrogate the conformational landscape of membrane proteins. This work broadens the applicability of Alphafold2 by generating multiple protein conformations to be tested biologically, biochemically, biophysically, and for use in structure-based drug design.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Methodology.
An initially generated MSA, via MMSeqs2, is input into Alphafold2 within ColabFold to generate five structural models. For illustration, the model with the highest pLDDT, AF2’s ranking of model confidence, is shown in red. Residues for mutation are chosen, in this case the three residues in red mediating a contact point on the upper surface of the protein. These mutations are made across the entire MSA (ignoring gaps). This modified MSA is then input into ColabFold for generation of new models. With the contact points in red missing, Alphafold2 within ColabFold generates a new conformation based on the contacts shown in blue.
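The column-wise substitution described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `mutate_msa_columns` is hypothetical, alanine is assumed as the replacement residue, every row is assumed to be aligned to the same length (no lowercase A3M insertions), and the query row is assumed to contain no gaps so that residue numbers map directly to columns.

```python
def mutate_msa_columns(msa, positions, replacement="A", gap_chars="-."):
    """Return a copy of the alignment with the chosen columns substituted.

    msa: list of equal-length aligned sequences (row 0 is the query).
    positions: 1-based residue positions (assumed equal to column numbers).
    replacement: residue written into each chosen column (assumption: alanine).
    Gap characters are left untouched, mirroring "ignoring gaps" in the caption.
    """
    if not msa:
        return []
    width = len(msa[0])
    if any(len(row) != width for row in msa):
        raise ValueError("all rows must have the same aligned length")
    columns = {p - 1 for p in positions}
    if any(c < 0 or c >= width for c in columns):
        raise ValueError("position outside the alignment")

    mutated = []
    for row in msa:
        chars = list(row)
        for c in columns:
            if chars[c] not in gap_chars:
                chars[c] = replacement
        mutated.append("".join(chars))
    return mutated


# Toy alignment: mutate positions 2 and 3 in every sequence.
example = ["MKRLE", "MK-LE", "MRRIE"]
print(mutate_msa_columns(example, [2, 3]))  # ['MAALE', 'MA-LE', 'MAAIE']
```

The modified rows would then be written back out in whatever alignment format the prediction tool accepts; that step is omitted here because it depends on the tool version.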
Fig 2. Sampling conformational flexibility of Adenylate Kinase (AK).
A) Crystal structures of Adenylate Kinase aligned on residues 1–25: closed, 1ake (lightorange) and open, 4ake (lightpink). B) AK model with the thickness of the chain based on the RMSF for the 15 AF2 models from the unmodified MSA. C) TM score plot comparing all AF2 models to the two crystal structures. The dashed line is the TM score between the two experimental structures. The color scale is based on the relative MolProbity score, (MP − MP_min) / MP_min. D) Histogram of MolProbity scores. In blue are all of the models. The hatched red plot is for the models parsed by excluding sets that are one standard deviation above the median of all models. E) TM score plot of the parsed set of models. F) PCA plot of the first two components for the parsed set of models. Note the outlier set to the lower right. G) Highlighted is the ATP binding region, 1ake (orange), 4ake (magenta), and a representative structure from the outliers in ‘F’ (red). H) PCA plot removing the misfolded models shown in ‘G’.
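The relative MolProbity score in panel C and the parsing step in panel D can be illustrated with a short sketch. Several details are assumptions made for the example rather than facts from the caption: the function names are hypothetical, the cutoff is applied per model (the caption speaks of excluding "sets"), and the population standard deviation is used because the caption does not say which estimator was chosen.

```python
from statistics import median, pstdev


def relative_molprobity(scores):
    """Relative score (MP - MP_min) / MP_min for each model."""
    mp_min = min(scores)
    if mp_min <= 0:
        raise ValueError("scores must be positive")
    return [(s - mp_min) / mp_min for s in scores]


def parse_by_molprobity(scores):
    """Indices of models scoring at most median + one standard deviation.

    Assumption: population standard deviation, applied model by model.
    """
    cutoff = median(scores) + pstdev(scores)
    return [i for i, s in enumerate(scores) if s <= cutoff]


# Made-up scores for five models; the last one is a clear outlier.
scores = [1.0, 1.2, 1.1, 1.3, 3.0]
print(relative_molprobity(scores))
print(parse_by_molprobity(scores))  # [0, 1, 2, 3]
```

Lower MolProbity scores indicate better geometry, so only the upper tail is trimmed in this sketch.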
Fig 3. Sampling conformational flexibility of Ribose Binding Protein (RBP).
A) Crystal structures of RBP: open, 1ba2B (lightorange) and closed, 2dri (lightpink). B) RBP model with the thickness of the chain based on the RMSF for the 15 AF2 models from the unmodified MSA. C) TM score plot comparing all AF2 models to the two crystal structures. The dashed line is the TM score between the two experimental structures. The color scale is based on the relative MolProbity score, (MP − MP_min) / MP_min. D) Histogram of MolProbity scores. In blue are all of the models. The hatched red plot is for the models parsed by excluding sets that are one standard deviation from the median of all models. E) TM score plot of the parsed set of models. F) PCA plot of the first two components for the parsed set of models.
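The per-residue RMSF (root-mean-square fluctuation) that sets the chain thickness in panel B can be computed as in the sketch below. It assumes one representative atom per residue (for example the C-alpha) and that all models have already been superimposed on a common frame; the function name `rmsf_per_residue` and the toy coordinates are illustrative only.

```python
import math


def rmsf_per_residue(models):
    """Per-residue RMSF across an ensemble of superimposed models.

    models[m][i] is the (x, y, z) position of residue i in model m.
    RMSF_i = sqrt(mean over models of |r_i - <r_i>|^2).
    """
    n_models = len(models)
    n_residues = len(models[0])
    if any(len(m) != n_residues for m in models):
        raise ValueError("all models must have the same number of residues")

    rmsf = []
    for i in range(n_residues):
        # Mean position of residue i over the ensemble.
        mean = [sum(m[i][k] for m in models) / n_models for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((m[i][k] - mean[k]) ** 2 for k in range(3)) for m in models
        ) / n_models
        rmsf.append(math.sqrt(msd))
    return rmsf


# Two toy models: residue 1 is fixed, residue 2 moves by 2 units along x.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf_per_residue(models))  # [0.0, 1.0]
```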
Fig 4. Sampling conformational flexibility of proteins with no structures in AF2 training set.
A) Models with the thickness of the chain based on the RMSF for the 15 AF2 models from the unmodified MSA. B) TM score plot for the 15 initial models. C) TM score plot for all models after parsing for MolProbity score. D) PCA of the parsed models. Fig D in S1 Text shows the experimental structures and the best model based on TM score.
Fig 5. Sampling conformational flexibility of proteins with one structure in AF2 training set.
A) Models with the thickness of the chain based on the RMSF for the 15 AF2 models from the unmodified MSA. B) TM score plot for the 15 initial models. C) TM score plot for all models after parsing for MolProbity score. D) PCA of the parsed models. Fig F in S1 Text shows the experimental structures and the best model based on TM score.
