Nat Commun. 2025 Jul 25;16(1):6874.
doi: 10.1038/s41467-025-62261-4.

Deep-learning structure elucidation from single-mutant deep mutational scanning

Zachary C Drake et al. Nat Commun. 2025.

Abstract

Deep learning has revolutionized protein structure prediction. AlphaFold2, a deep neural network, vastly outperformed previous algorithms, predicting protein structures with near-atomic accuracy. Despite this success, limitations remain that prevent accurate predictions for numerous protein systems. Here we show that sparse residue burial restraints derived from deep mutational scanning (DMS) can be used to refine AlphaFold2 and significantly improve its predictions. Burial information extracted from DMS explicitly guides residue placement during structure generation. The resulting method, DMS-Fold, was validated on both simulated and experimental single-mutant DMS data; it outperformed AlphaFold2 for 88% of protein targets, and 252 proteins improved by more than 0.1 in TM-Score. DMS-Fold is free and publicly available at https://github.com/LindertLab/DMS-Fold.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Heat maps depicting the correlation between changes in protein thermodynamic stabilities (ΔΔG) and solubility metrics, atomic depth (AD) and neighbor count (NC), for individual mutational types across the mega-scale set.
a Comparing ΔΔGs to native residue atomic depth. b Comparing ΔΔGs to native neighbor count. c Differences in correlation coefficients between atomic depth and neighbor count. d Comparing ΔΔGs to burial extent (defined as weighted average of both atomic depth and neighbor count).
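The caption defines burial extent only as a weighted average of atomic depth and neighbor count. A minimal sketch of such a combination follows; the weight `w_depth`, the normalization caps `depth_max` and `count_max`, and the function name are all assumptions chosen for illustration and are not taken from the paper.

```python
def burial_extent(atomic_depth, neighbor_count,
                  w_depth=0.5, depth_max=10.0, count_max=40.0):
    """Combine atomic depth and neighbor count into one burial value in [0, 1].

    All defaults are hypothetical: the source does not state the weights
    or how the two metrics are scaled before averaging.
    """
    if not 0.0 <= w_depth <= 1.0:
        raise ValueError("w_depth must lie in [0, 1]")
    # Scale each metric to [0, 1] so the two are comparable before averaging.
    depth_norm = min(max(atomic_depth / depth_max, 0.0), 1.0)
    count_norm = min(max(neighbor_count / count_max, 0.0), 1.0)
    # Weighted average of the two normalized metrics.
    return w_depth * depth_norm + (1.0 - w_depth) * count_norm


if __name__ == "__main__":
    # A residue halfway to both caps gets a burial extent of 0.5.
    print(burial_extent(5.0, 20.0))
```

Normalizing first keeps either metric from dominating simply because of its units; a different scaling would change the values but not the idea.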
Fig. 2. DMS-Fold overview.
a Atomic depth, neighbor count, and mutation ΔΔGs are used to identify mutational types likely to be destabilizing for buried residues. These mutational types are used to calculate burial scores of residues from given mutational stabilities. b DMS-Fold network architecture based on the original OpenFold architecture. Residue burial information derived from deep mutational scanning data (a) is encoded as burial scores. These are then embedded into the pair representation along the diagonal. The pair representation, coupled with the MSA representation, is initialized before being processed by the Evoformer.
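To make "embedded into the pair representation along the diagonal" concrete, the sketch below adds a per-residue embedding to the (i, i) entries of an L × L × C pair tensor held as nested lists. The linear projection (`weight`, `bias`), the channel count, and the function name are hypothetical; the caption does not describe the actual embedding layer used in DMS-Fold.

```python
def embed_burial_on_diagonal(pair, burial_scores, weight, bias):
    """Return a copy of `pair` with burial embeddings added on the diagonal.

    pair          : L x L x C nested lists (pair representation)
    burial_scores : L floats, one burial score per residue
    weight, bias  : C floats each, a hypothetical linear projection that
                    maps a scalar score to C channels
    """
    n = len(pair)
    if len(burial_scores) != n:
        raise ValueError("need one burial score per residue")
    if len(weight) != len(bias):
        raise ValueError("weight and bias must have the same length")
    # Copy so the input representation is left untouched.
    out = [[list(channels) for channels in row] for row in pair]
    for i, score in enumerate(burial_scores):
        for c in range(len(weight)):
            # Only the (i, i) entry receives per-residue information.
            out[i][i][c] += weight[c] * score + bias[c]
    return out


if __name__ == "__main__":
    pair = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
    updated = embed_burial_on_diagonal(pair, [1.0, 0.0, 0.5],
                                       weight=[2.0, -1.0], bias=[0.0, 0.5])
    print(updated[0][0], updated[0][1])
```

The diagonal is a natural slot for this signal because entry (i, i) of a pair tensor describes residue i alone, whereas off-diagonal entries describe residue pairs.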
Fig. 3. Performance of DMS-Fold on the 710 CASP14/CAMEO proteins with simulated changes in protein thermodynamic stabilities (ΔΔGs).
a Template modeling score (TM-Score) comparison of predictions from DMS-Fold and AlphaFold2 (N = 25) using a size-dependent number of nonredundant sequences (Neff). The size of each marker represents the Neff used for MSA subsampling. Color represents the change in network confidence, pLDDT, between DMS-Fold and AlphaFold2. b TM-Score distributions of both networks binned by the TM-Scores of AlphaFold2 predictions (N = 25). c TM-Score distributions of predictions from both DMS-Fold and AlphaFold2 (N = 1) using different uniform Neff values. d Five predicted structures, aligned to the native structure (grey), where DMS-Fold with a size-dependent Neff (blue) had a TM-Score improvement > 0.5 compared to AlphaFold2 with no MSA subsampling (orange). e Comparison of changes in pLDDTs and TM-Scores between predictions with DMS-Fold and AlphaFold2. Color represents the change in the difference of solubility metrics for the DMS-Fold structure and the native structure with the AlphaFold2 structure and the native structure. Points in panels a and b show the mean. In all box plots, the line shows the median and the whiskers represent 1.5× the interquartile range.
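For readers unfamiliar with the metric reported throughout these figures, the sketch below evaluates the standard TM-score formula for one fixed superposition, given per-residue distances (in Å) between aligned residue pairs. It is an illustration only: a full TM-score calculation also searches for the superposition that maximizes the sum, which is omitted here, and the 0.5 Å floor on `d0` for short chains is an assumption that mirrors common implementations.

```python
def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances     : distances (in angstroms) between aligned residue pairs
    target_length : number of residues in the native (target) structure
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Length-dependent distance scale from the standard TM-score definition.
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumed floor for very short chains
    # Each aligned pair contributes a value in (0, 1]; unaligned residues
    # contribute nothing, so the score is normalized by the target length.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # A perfect match over every residue scores exactly 1.0.
    print(tm_score([0.0] * 100, 100))
```

Because the score is normalized by target length and lies in (0, 1], an improvement of 0.1 or 0.5, as quoted for DMS-Fold, is a substantial shift on this scale.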
Fig. 4. Performance of DMS-Fold on the 175 Mega-scale proteins using experimental changes in protein thermodynamic stabilities (ΔΔGs).
a Template modeling score (TM-Score) comparison of predictions from DMS-Fold and AlphaFold2 (N = 25) using a size-dependent number of nonredundant sequences (Neff). The size of each marker represents the Neff used for MSA subsampling. Color represents the change in network confidence, pLDDT, between DMS-Fold and AlphaFold2. b TM-Score distributions of both networks binned by the TM-Scores of AlphaFold2 predictions (N = 25). c TM-Score distributions of predictions from both DMS-Fold and AlphaFold2 using different uniform Neff values. d Top five predicted structures from AlphaFold2 with a size-dependent Neff (orange) and DMS-Fold (N = 1) with a size-dependent Neff (blue) aligned to their native structure (grey). e Comparison of changes in pLDDTs and TM-Scores between predictions with DMS-Fold and AlphaFold2. Color represents the change in the difference of solubility metrics for the DMS-Fold structure and the native structure with the AlphaFold2 structure and the native structure. Points in panels a and b show the mean. In all box plots, the line shows the median and the whiskers represent 1.5× the interquartile range.
Fig. 5. Burial scores explicitly guide DMS-Fold inference.
a AlphaFold2 prediction of protein 2A (PDB ID: 7BNY) with residues colored by encoded burial score (legend in panel c). b DMS-Fold prediction of protein 2A with residues colored by encoded burial score. c Per-residue comparisons of predicted encoded burial scores and burial extents of the native, AlphaFold2, and DMS-Fold structures. d DMS-Fold prediction of protein 2A using false encoded burial scores of zero for all residues (blue) compared to native structure (grey).
