Proc Natl Acad Sci U S A. 2017 Feb 28;114(9):2265-2270.
doi: 10.1073/pnas.1614437114. Epub 2017 Feb 14.

Trade-offs between enzyme fitness and solubility illuminated by deep mutational scanning

Justin R Klesmith et al. Proc Natl Acad Sci U S A.

Abstract

Proteins are marginally stable, and an understanding of the sequence determinants for improved protein solubility is highly desired. For enzymes, it is well known that many mutations that increase protein solubility decrease catalytic activity. These competing effects frustrate efforts to design and engineer stable, active enzymes without laborious high-throughput activity screens. To address the trade-off between enzyme solubility and activity, we performed deep mutational scanning using two different screens/selections that purport to gauge protein solubility for two full-length enzymes. We assayed a TEM-1 beta-lactamase variant and levoglucosan kinase (LGK) using yeast surface display (YSD) screening and a twin-arginine translocation pathway selection. We then compared these scans with published experimental fitness landscapes. Results from the YSD screen could explain 37% of the variance in the fitness landscapes for one enzyme. Five percent to 10% of all single missense mutations improve solubility, matching theoretical predictions of global protein stability. For a given solubility-enhancing mutation, the probability that it would retain wild-type fitness was correlated with evolutionary conservation and distance to active site, and anticorrelated with contact number. Hybrid classification models were developed that could predict solubility-enhancing mutations that maintain wild-type fitness with an accuracy of 90%. The downside of using such classification models is the removal of rare mutations that improve both fitness and solubility. To reveal the biophysical basis of enhanced protein solubility and function, we determined the crystallographic structure of one such LGK mutant. Beyond fundamental insights into trade-offs between stability and activity, these results have potential biotechnological applications.
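
The filtering idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' hybrid classification model: the `Mutation` fields, the threshold values, and the example candidates are all hypothetical. The sketch only encodes the directions stated above, namely that retention of wild-type fitness was correlated with evolutionary conservation and distance to the active site, and anticorrelated with contact number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    name: str
    conservation_score: float    # hypothetical PSSM-like score; higher = more favored in evolution
    active_site_distance: float  # distance to the active site, in angstroms
    contact_number: int          # number of neighboring residues


def passes_filters(m, min_conservation=0.0, min_distance=15.0, max_contacts=16):
    """Return True if a solubility-enhancing mutation passes all three filters.

    All thresholds are placeholder assumptions, not values from the paper.
    """
    return (
        m.conservation_score >= min_conservation
        and m.active_site_distance >= min_distance
        and m.contact_number <= max_contacts
    )


# Hypothetical candidates with made-up names and values
candidates = [
    Mutation("surface_site", 1.2, 22.0, 9),
    Mutation("near_active_site", -0.8, 6.5, 21),
]
kept = [m.name for m in candidates if passes_filters(m)]
print(kept)
```

A hard filter of this kind is easy to audit, but as the abstract notes, it can also discard rare mutations that improve both fitness and solubility.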

Keywords: deep mutational scanning; fitness landscapes; high-throughput screening; protein solubility; yeast surface display.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overview of solubility deep mutational scans for TEM-1.1 and LGK. (Left) Screens used in the present work. In YSD, the protein is exported to the surface and labeled by a fluorescent antibody that is specific for a C-terminal epitope tag. The top 5% of cells by fluorescence intensity are collected by FACS. For Tat export, a protein is fused to a C-terminal beta-lactamase that requires periplasm localization for activity. Variants are selected on plates containing high antibiotic concentrations. (Center and Right) Heat maps of solubility scores for selected residues of TEM-1.1 and LGK. Residues in the active site are indicated by (*), interface by (I), and proximal to the C terminus by (C).
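The YSD gate in this caption (collecting the top 5% of cells by fluorescence intensity) can be illustrated with a small Python sketch. The function name `top_fraction_gate` and the synthetic intensities are assumptions made for illustration; a real sort gate is set on the cytometer, not computed like this.

```python
def top_fraction_gate(intensities, fraction=0.05):
    """Return (threshold, collected) for the brightest `fraction` of cells.

    Cells are selected by rank, so `collected` always holds
    round(len(intensities) * fraction) cells (at least one).
    """
    if not intensities:
        raise ValueError("intensities must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = max(1, round(len(intensities) * fraction))
    ranked = sorted(intensities, reverse=True)
    collected = ranked[:n_keep]
    # The dimmest collected cell defines the effective gate threshold
    return collected[-1], collected


# Synthetic example: 100 cells with intensities 1.0 .. 100.0
intensities = [float(i) for i in range(1, 101)]
threshold, collected = top_fraction_gate(intensities)
print(threshold, len(collected))
```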
Fig. 2.
Validation of solubility datasets. (A) Nonsense vs. missense solubility scores for YSD (LGK, TEM-1.1). (B) Fraction of beneficial mutations above the lower bounds versus contact number for LGK and TEM-1.1 (residues 61–215). Known stabilizing mutations (yellow) are mapped onto TEM-1.1 (C; PDB ID code 1M40) and LGK (D; PDB ID code 4ZLU). (Insets) Structural basis of the stabilizing mutations, shown as yellow sticks, along with the corresponding solubility scores identified by deep sequencing.
Fig. 3.
Distribution of solubility-enhancing mutations. (A) Frequency of mutations for TEM-1.1 YSD (blue) and LGK YSD (black) found at each solubility score. Each dataset is fit with a cubic spline to help guide the eye. (B) Positions with more than 10 beneficial mutations in the TEM-1.1 YSD dataset are shown as yellow sticks. These false-positive results are predicted to disrupt the C-terminal helix, presumably to promote accessibility of the c-myc epitope tag. (C) Percentage of mutations with solubility scores above a 10% (hatched fill) and 50% (solid fill) increase in function for TEM-1.1 YSD (blue) and LGK YSD (gray). TEM-1.1 YSD* covers residues 61–215 to remove the section with false-positive results indicated in B.
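The percentages in panel C amount to counting mutations whose score exceeds a cutoff. The sketch below shows that calculation in Python. The example scores are made up, and the two cutoffs are placeholders: this page does not state how a 10% or 50% increase in function maps onto the solubility score scale.

```python
def percent_above(scores, threshold):
    """Percentage of scores strictly greater than `threshold`."""
    if not scores:
        raise ValueError("scores must not be empty")
    n_above = sum(1 for s in scores if s > threshold)
    return 100.0 * n_above / len(scores)


# Made-up solubility scores for ten mutations
scores = [-1.0, -0.5, 0.0, 0.05, 0.2, 0.7, -2.0, 0.1, -0.3, 0.6]

# Placeholder cutoffs standing in for the "10%" and "50%" levels
low_cutoff, high_cutoff = 0.14, 0.58

print(percent_above(scores, low_cutoff), percent_above(scores, high_cutoff))
```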
Fig. 4.
Classification methods improve probabilities of selecting mutations conferring solubility and activity but remove rare, globally optimal mutations. Classifier probabilities for YSD deep mutational scan for TEM-1.1 (A) and LGK (B). The total number of mutations found in a given bin (n) is provided, and the PSSM represents the site-specific preferences found in the evolutionary history of the enzyme. (C) Classification methods improve probabilities of selecting neutral mutations. (D) LGK fitness versus the LGK solubility score of individual mutations. Beneficial mutations from the YSD screen are shown as circles colored by whether they pass (red) or fail (yellow) the multiple-filter classification method. The Pareto optimal mutation G359R (boxed) fails the filtering due to its close distance to the active site, low evolutionary conservation, and high contact number. (E) Crystal structure of LGK G359R (PDB ID code 5TKR). G359R makes direct and water-mediated hydrogen bonds with ADP near the active site. A potassium ion also appears to be coordinated in this region, possibly contributing to the stability of the enzyme. Carbon atoms are shown in gray and yellow for the protein and ligand atoms, respectively. Nitrogen, oxygen, and phosphorus atoms are shown in blue, red, and orange, respectively. Waters and the potassium ion are shown as red and cyan spheres, respectively. The 2mFo-DFc electron density map is contoured to 1σ. For clarity, the magnesium ions in the active site have been omitted from the figure.
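Panel D describes G359R as Pareto optimal with respect to fitness and solubility. As a minimal sketch of what that means, the Python snippet below finds the mutations that no other mutation beats on both axes at once. The mutation names and (fitness, solubility) values are invented for illustration and are not taken from the LGK dataset.

```python
def pareto_front(points):
    """Return the names of non-dominated points.

    `points` maps a name to a (fitness, solubility) pair. A point is
    dominated if another point is at least as good on both axes and
    strictly better on at least one.
    """
    front = []
    for name, (f, s) in points.items():
        dominated = any(
            (f2 >= f and s2 >= s) and (f2 > f or s2 > s)
            for other, (f2, s2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Invented (fitness, solubility) values
mutations = {
    "mut_a": (1.2, 0.2),
    "mut_b": (0.6, 0.9),
    "mut_c": (0.5, 0.1),
    "mut_d": (1.0, 0.8),
}
print(pareto_front(mutations))
```

Comparing the Pareto front with the set that survives a filter is one way to see which globally optimal mutations a classifier discards.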
