J Am Chem Soc. 2024 Jan 24;146(3):2054-2061.
doi: 10.1021/jacs.3c10941. Epub 2024 Jan 9.

Improving Protein Expression, Stability, and Function with ProteinMPNN

Kiera H Sumida et al. J Am Chem Soc.

Abstract

Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, provides a route to increasing protein expression, stability, and function. For both myoglobin and tobacco etch virus (TEV) protease, we generated designs with improved expression, elevated melting temperatures, and improved function. For TEV protease, we identified multiple designs with improved catalytic activity as compared to the parent sequence and previously reported TEV variants. Our approach should be broadly useful for improving the expression, stability, and function of biotechnologically important proteins.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Design strategy for the optimization of protein expression and stability using ProteinMPNN. The design space is chosen to preserve the native protein function by fixing the amino acid identities of residues close to the ligand and of residues that are highly conserved in multiple sequence alignments. The protein backbone structure and the fixed-position information are input into ProteinMPNN, which generates new amino acid sequences likely to fold to the input structure. The backbone structure in loop regions can optionally be remodeled using RoseTTAFold joint inpainting to further idealize the input protein.
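The selection of fixed positions described in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names, the use of Shannon entropy as the conservation score, the 5.0 Å distance cutoff, and the assumption that alignment columns map one-to-one onto residues of the target protein are all hypothetical choices made for the example. The 50% fraction mirrors the "50% most highly conserved residues" mentioned for TEV protease in Figure 3.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return float("inf")  # all-gap column: treat as not conserved
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def most_conserved_positions(msa, fraction=0.5):
    """0-based indices of the `fraction` of columns with the lowest entropy.

    Assumes every sequence in `msa` has the same length and that column i
    corresponds to residue i of the target protein (an assumption of this sketch).
    """
    length = len(msa[0])
    entropies = [column_entropy([seq[i] for seq in msa]) for i in range(length)]
    # Sort by entropy, breaking ties by position so the result is deterministic.
    ranked = sorted(range(length), key=lambda i: (entropies[i], i))
    return set(ranked[: int(length * fraction)])


def positions_near_ligand(residue_coords, ligand_coords, cutoff=5.0):
    """Indices of residues with any atom within `cutoff` angstroms of a ligand atom.

    `residue_coords` maps residue index -> list of (x, y, z) atom coordinates.
    The 5.0 angstrom default is an illustrative value, not taken from the paper.
    """
    return {
        idx
        for idx, atoms in residue_coords.items()
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_coords)
    }


def fixed_positions(msa, residue_coords, ligand_coords, fraction=0.5, cutoff=5.0):
    """Union of conserved and ligand-proximal positions, sorted."""
    conserved = most_conserved_positions(msa, fraction)
    near = positions_near_ligand(residue_coords, ligand_coords, cutoff)
    return sorted(conserved | near)
```

The returned indices are the positions whose amino acid identities would be held constant; all remaining positions are left free for redesign. How that list is passed to the sequence design tool depends on the tool's own input format, which is not covered here.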
Figure 2
ProteinMPNN design improves myoglobin expression and thermostability. (a) Positions adjacent to the heme were kept fixed during the sequence design (shown in blue). Non-conserved regions (in yellow) were subjected to backbone remodeling. Inset shows the heme-binding site. (b) SEC traces of 20 designed myoglobin variants. (c) Soluble yield of myoglobin designs and native myoglobin (nMb, represented as a red dashed line). (d) CD melting temperature plots of dnMb19 compared to native myoglobin (signal reported in molar residue ellipticity (MRE)). (e) Absorbance plots of dnMb19 and native myoglobin (inset shows the temperature scan). (f) Structural alignment of the crystal structure (green) and AlphaFold2 (AF2) prediction (gray) of dnMb19. (g) Overlay of the crystal structure of native myoglobin (gray) and the crystal structure of dnMb19 (green, PDB: 8U5A). Non-conserved regions displayed in insets II and III were subjected to backbone redesign.
Figure 3
ProteinMPNN sequence design improves TEV protease expression, thermostability, and catalytic efficiency. (a) TEVd (PDB: 1LVM) input structure with positions fixed during redesign highlighted. Active site residues surrounding the substrate (blue), 50% most highly conserved residues (yellow), and catalytic residues (pink) are highlighted. Inset shows a zoomed-in view of the active site region. (b) SEC traces of the designed TEV variants. (c) Diagram of TEV substrate (top) and fluorescent gel image of TEV cleavage reactions at various time points (bottom). (d) CD melting temperature plots of the designed and native TEV (signal reported in molar residue ellipticity (MRE)). (e) Benchtop stability comparison of native TEVd and the designed variant assessed as activity measured over time incubated at 30 °C before inclusion in the assay. (f) Decreased evolutionary constraints correlate with higher soluble expression levels. Legend indicates regions fixed during the design (all designs have the active site fixed). (g) Designs made with the active site and 50% most conserved residues fixed during design exhibited the highest catalytic activity. Raw apparent rate is reported in relative fluorescence units (RFU) per second.
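An apparent rate in RFU per second, as reported in panel (g), can be read as the slope of fluorescence signal against time. The record does not say how the authors computed it, so the following is only a minimal sketch of one common approach: an ordinary least-squares line fitted to the time points, assuming the chosen points lie in the approximately linear initial phase of the reaction. The function name and the example values are hypothetical.

```python
def apparent_rate(times_s, signal_rfu):
    """Least-squares slope of fluorescence versus time, in RFU per second.

    Assumes the supplied points come from the roughly linear part of the
    progress curve (an assumption of this sketch, not a detail from the paper).
    """
    n = len(times_s)
    if n < 2 or n != len(signal_rfu):
        raise ValueError("need at least two matched time/signal points")
    mean_t = sum(times_s) / n
    mean_f = sum(signal_rfu) / n
    # Sum of squared deviations in time; zero means all time points coincide.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, signal_rfu))
    return sxy / sxx


# Made-up readings taken every 10 s, used only to show the call.
rate = apparent_rate([0, 10, 20, 30], [100.0, 150.0, 200.0, 250.0])
```

Comparing designs by this slope is only meaningful when the assays share enzyme concentration, substrate concentration, and instrument settings, since RFU is an instrument-relative unit.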

