2018 Oct 1;35(10):2345-2354.
doi: 10.1093/molbev/msy141.

Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function

Jakub Otwinowski. Mol Biol Evol.

Abstract

Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. The essential function of many proteins that fold into a specific structure is their ability to bind to a ligand, which can be assayed for thousands of mutated variants. However, binding assays do not distinguish whether mutations affect the stability of the binding interface or the overall fold. Here, we introduce a statistical method to infer a detailed energy landscape of how a protein folds and binds to a ligand by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high-quality data of protein G domain B1 binding to IgG-Fc. We infer distinct folding and binding energies for each mutation, providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its nonlinear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
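The abstract's central point, that energies can be strictly additive while the binding fraction is still epistatic, can be illustrated with a minimal Python sketch. It is not the paper's fitting code. It assumes the standard Boltzmann form for three states (unfolded/unbound at energy 0, folded/unbound at G_f, folded/bound at G_f + G_b), which is a guess at the shape of eq. (3) since this page does not reproduce it. The thermal energy `KT`, the example energies, and the log-scale epistasis definition are all hypothetical choices made for illustration.

```python
import math

KT = 0.593  # kcal/mol near 298 K; an assumed temperature, not taken from the paper


def binding_fraction(g_fold, g_bind, kt=KT):
    """Fraction of protein that is folded and bound in a three-state model.

    Free energies are relative to the unfolded, unbound state:
    unfolded/unbound = 0, folded/unbound = g_fold,
    folded/bound = g_fold + g_bind.
    """
    return 1.0 / (1.0 + math.exp(g_bind / kt) * (1.0 + math.exp(g_fold / kt)))


def log_epistasis(p_wt, p_a, p_b, p_ab):
    """Pairwise epistasis on the log binding fraction (assumed definition)."""
    return math.log(p_ab) - math.log(p_a) - math.log(p_b) + math.log(p_wt)


# Hypothetical energies in kcal/mol: a stable wild type and two mutations
# that each destabilize the fold by 2 kcal/mol without touching binding.
wt_fold, wt_bind = -3.0, -1.0
dg_fold_a, dg_fold_b = 2.0, 2.0

p_wt = binding_fraction(wt_fold, wt_bind)
p_a = binding_fraction(wt_fold + dg_fold_a, wt_bind)
p_b = binding_fraction(wt_fold + dg_fold_b, wt_bind)
# The double mutant's energy is exactly the sum of the single-mutant effects.
p_ab = binding_fraction(wt_fold + dg_fold_a + dg_fold_b, wt_bind)

eps = log_epistasis(p_wt, p_a, p_b, p_ab)
print(f"p_wt={p_wt:.3f} p_a={p_a:.3f} p_b={p_b:.3f} p_ab={p_ab:.3f} eps={eps:.3f}")
```

With these made-up values each single mutant barely changes the binding fraction, because the wild type has stability to spare, while the double mutant crosses the folding threshold and loses much of its binding. The resulting epistasis is negative even though no interaction term appears in the energies.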

Figures

Fig. 1.
Thermodynamic model of three protein states: unfolded and unbound, folded and unbound, and folded and bound, described by equation (3). Shaded areas correspond to regions in energy space where the labeled state is dominant. (A) Given sequences and binding fractions p_fb, the nonlinear Boltzmann form (eq. 3) imposes constraints on the possible parameters (additive energies). Solid lines are the energies compatible with p_fb for four hypothetical sequences: wild-type (WT), two single mutants (Mut A, Mut B), and a double mutant (Mut AB). Dashed lines represent additive energies and connect the wild-type to the single mutants and to the double mutant, with the double mutant's line equal to the sum of the single-mutant additive energies. (B) Folding versus binding energy for single mutants (additive effects). The gray dot is the wild-type energy; the dashed line marks where the binding fraction equals the wild-type value, p_fb = p_fb^W, and below it the binding fraction is higher.
Fig. 2.
(A) Prediction of changes in folding energy g_f (eq. 5, commonly referred to as ΔΔG) by fitting a three-state thermodynamic model to deep mutational scanning data. Predicted energies have a root mean square error of 0.39 kcal/mol and ρ = 0.91 compared with a literature set of folding energies of 81 single mutants (table S4, Olson et al. 2014). The inset shows eight variants with two to seven mutations, including two highly stable engineered variants that are poorly predicted, suggesting nonadditivity in stability between the mutated sites (table S5, Olson et al. 2014). (B) Marginal binding energy is well approximated by the two-state energy when binding is weak and fold stability is strong: G_f < 0 < G_b. Main plot shows all single mutants with G_f < 0 (black, ρ = 0.96) and G_f > 0 (gray). Inset shows the subset of 81 single mutants with measured stabilities, as in (A), with ρ = 0.99. All lines have a slope of unity.
Fig. 3.
Inferred additive binding (g_b) and folding (g_f) energies show strikingly different patterns. Three of the binding sites (27, 31, 43) have strong effects on binding. Many mutations at positions 23 and 41 are beneficial for folding and deleterious for binding, although overall g_b and g_f are uncorrelated (ρ = 0.03, p-value = 0.28).
Fig. 4.
Patterns of pairwise epistasis observed in the data and predicted by the two-state (A) and three-state (B) thermodynamic models (above the diagonal). Below the diagonal in (A) and (B) is the observed pairwise fitness epistasis (eq. 2) averaged across amino acids for each pair of sites. The correlation between observed and predicted epistasis is ρ = 0.50 and 0.66 for the two- and three-state models, respectively, and the accuracy of predicting the sign, that is, the number of predictions that have the same sign as observed divided by the total, is 0.54 and 0.73 for the two- and three-state models. Nonbiological epistasis that is a consequence of the experimental limits on measured fitness due to nonspecific background binding was filtered out in all calculations and plots (see Materials and Methods). (C) Above the diagonal is the observed epistasis for sites where the three-state model overestimates, Ĵ_ij > J_ij (top 5% of normalized errors, see Materials and Methods), and below the diagonal is where it underestimates, Ĵ_ij < J_ij (bottom 5% of normalized errors). Underestimated epistasis is largely positive and corresponds to a dynamically correlated network of residues.
Fig. 5.
Fitness is less predictable in a follow-up study that targeted all combinations at four sites (Wu et al. 2016). Panels show true and inferred fitness for one to four mutations from wild-type. A substantial fraction of functional variants are underestimated, suggesting some unaccounted epistasis. See also supplementary figure S5 and table S1, Supplementary Material online.
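The sign-accuracy measure defined in the Fig. 4 caption (the number of predictions with the same sign as observed, divided by the total) can be sketched in a few lines of Python. The function name and the example values below are hypothetical. Counting a zero in either list as a mismatch is an assumption, because the caption does not say how ties are handled.

```python
def sign_accuracy(observed, predicted):
    """Fraction of pairs whose predicted epistasis has the same sign as observed.

    A pair counts as a match only when both values are nonzero and share a
    sign; zeros are treated as mismatches (an assumed convention).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one pair is required")
    matches = sum(1 for o, p in zip(observed, predicted) if o * p > 0)
    return matches / len(observed)


# Made-up epistasis values for four site pairs, for illustration only.
observed = [0.5, -0.2, 0.1, -0.4]
predicted = [0.3, 0.1, 0.2, -0.1]
accuracy = sign_accuracy(observed, predicted)
print(accuracy)  # 0.75: three of the four pairs agree in sign
```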

