Nat Commun. 2022 Mar 22;13(1):1536.
doi: 10.1038/s41467-022-28776-w.

Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics

Kathrin Leppek et al. Nat Commun.

Abstract

Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop an RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that highly structured "superfolder" mRNAs can be designed to improve both stability and expression with further enhancement through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines.

Conflict of interest statement

Stanford University has submitted provisional patent applications related to use of the Hoxa9 P4 stem-loop (K.L. and M.B.), and the SARS-CoV-2 5′ UTR (K.L. and M.B.), computational design of mRNAs (H.K.W.-S., D.S.K., C.C., E.S., R.D.), chemically modified nucleotides to stabilize RNA therapeutics (W.K. and R.D.), and design of reporter mRNAs and the PERSIST-seq platform (M.B., G.W.B., and K.L.). R.D., D.S.K., W.K., and C.H.K. have initiated a commercial venture focused on improving RNA design. R.D. received an honorarium for speaking at Pfizer's 2021 mRNA science day. F.D., H.C., P.G., J.W., F.M., S.S., A.S., and P.R.D. are or were employees of Pfizer and may hold stock options. P.R.D. is currently an employee and shareholder of GlaxoSmithKline (GSK).

Figures

Fig. 1. PERSIST-seq overview and illustrative ribosome load insights.
a Overview of the mRNA optimization workflow. Literature-mined and rationally designed 5′ and 3′ UTRs were combined with Eterna and algorithmically designed coding sequences. All sequences were then experimentally tested in parallel for in-solution and in-cell stability as well as ribosome load. The mRNA design included unique, 6–9 nt barcodes in the 3′ UTR for tag counting by short-read sequencing. b Experimental design for testing in-solution and in-cell stability and ribosome load in parallel. mRNAs were in vitro transcribed, 5′ capped, and polyadenylated in a pooled format before transfection into HEK293T cells or being subjected to in-solution degradation. Transfected cells were then harvested for sucrose gradient fractionation or in-cell degradation analysis. c Polysome trace from HEK293T cells transfected with the 233-mRNA pool. d 5′ UTR variants display a higher variance in mean ribosome load per construct as determined from polysome sequencing. The formula for ribosome load is given. Box hinges: 25% quantile, median, 75% quantile, respectively, from left to right. Whiskers: lower or upper hinge ± 1.5 × interquartile range. e Heatmaps from polysome profiles of mRNA designs selected from the top, middle, and bottom five mRNAs (by ribosome load) from each design category. f Secondary structure model of the SARS-CoV-2 5′ UTR. Introduced mutations and substitutions are highlighted. g Heatmaps of SARS-CoV-2 5′ UTR variants’ polysome profiles sorted by ribosome load.
Fig. 2. In-cell RNA stability drives downstream protein expression levels.
a In-cell half-life of each mRNA design in HEK293T cells. Box hinges: 25% quantile, median, 75% quantile, respectively, from left to right. Whiskers: lower or upper hinge ± 1.5 × interquartile range. b Higher polysome load correlates with decreased in-cell half-life. Correlation between in-cell half-life and mean ribosome load across the entire profile (left), monosome-to-free subunit ratio (center), or polysome-to-monosome ratio (right). c In-cell half-life and mean ribosome load for individual mRNA designs with varying UTRs. d Kinetic model for predicting protein expression from mRNA half-life and ribosome load. P(t) is protein quantity at time t; k_t is translation rate; and k_m and k_p are rates of mRNA and protein decay, respectively. e Protein expression predicted using the kinetic model in (d) on the basis of mRNA half-life and ribosome load. Predicted protein expression of each UTR variant; note closer similarity to in-cell half-life data than to ribosome load in (c). f Correlation of predicted protein expression and Nluc/Fluc activity at 12 h in HEK293T cells. Predicted protein expression is normalized by mRNA length (corresponding to transfecting equal masses of each mRNA). g In-solution half-life of various mRNA design variants. mRNA lifetimes are strongly dependent on mRNA length and designed structures, revealed by time courses of mRNA degradation under accelerated aging conditions (10 mM MgCl2, 50 mM Na-CHES, pH 10.0). Box hinges: 25% quantile, median, 75% quantile, respectively, from left to right. Whiskers: lower or upper hinge ± 1.5 × interquartile range. h Nucleotide-resolution in vitro DMS mapping confirms large differences in structural accessibility between a highly structured JEV-HA-Nluc mRNA construct, “LinearDesign-1”, and a highly unstructured construct, “Yellowstone”. The 5′ and 3′ UTRs (hHBB) were kept constant between designs. Each point represents normalized DMS reactivity from one nucleotide position of the RNA. Box plot represents median and 25th and 75th percentiles (interquartile range, IQR); whiskers extend to maximum and minimum values. i Nucleotide DMS accessibility mapped onto structures from DMS-directed structure prediction.
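The kinetic model in panel d can be sketched numerically. The snippet below is a minimal illustration, assuming the standard two-step scheme in which mRNA decays at rate k_m, protein is produced at rate k_t per mRNA, and protein decays at rate k_p. The closed-form expression, the function names, and the example values are assumptions made for illustration and are not taken from the paper.

```python
import math


def rate_from_half_life(t_half):
    """Convert a half-life into a first-order decay rate (k = ln 2 / t_half)."""
    return math.log(2) / t_half


def protein_level(t, k_t, k_m, k_p, m0=1.0):
    """Protein quantity P(t) for the assumed two-step model:

        dM/dt = -k_m * M            (mRNA decay, M(0) = m0)
        dP/dt = k_t * M - k_p * P   (translation and protein decay, P(0) = 0)
    """
    if math.isclose(k_m, k_p):
        # Limiting form when both decay rates coincide.
        return k_t * m0 * t * math.exp(-k_m * t)
    return k_t * m0 * (math.exp(-k_m * t) - math.exp(-k_p * t)) / (k_p - k_m)


# Hypothetical comparison: same translation rate, different mRNA half-lives (hours).
stable = protein_level(12.0, k_t=1.0, k_m=rate_from_half_life(10.0), k_p=0.05)
labile = protein_level(12.0, k_t=1.0, k_m=rate_from_half_life(5.0), k_p=0.05)
print(f"stable mRNA: {stable:.2f}, labile mRNA: {labile:.2f}")
```

Under these assumptions, at a fixed translation rate the mRNA with the longer half-life always yields more protein at a given time point, which is in line with the abstract's finding that in-cell stability is a strong driver of protein output.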
Fig. 3. High-throughput in-line hydrolysis uncovers principles of in-solution RNA degradation.
a Eterna participants were asked to design 68-nucleotide RNA fragments maximizing sequence and structure diversity. In total, 3030 constructs were characterized and probed using high-throughput in-line degradation (In-line-seq). b Nucleotide-resolution degradation of 2165 68-nt RNA sequences (filtered for signal quality), probed by In-line-seq, sorted by hierarchical clustering on degradation profiles. c Sequences span a diverse set of secondary structure motifs, revealing patterns in degradation based on both sequence (i.e., linkages ending at 3′ uridine are particularly reactive) and structure (symmetric internal loops, circled, have suppressed hydrolytic degradation compared to asymmetric internal loops). d The ridge regression model “DegScore” was trained to predict per-nucleotide degradation from sequence and loop assignment information. Coefficients with the largest magnitude corresponded to sequence identity immediately after the link, with U being most disfavored. e DegScore showed improved predictive power on mRNAs over two other metrics previously posited to predict RNA stability. Half-life: in-solution mRNA half-life, calculated from degradation coefficients of the exponential decay fit on time course data in PERSIST-seq. Errors are standard deviations estimated by exponential fits to bootstrapped data. dG(MFE): Free energy of minimum free energy structure, calculated in RNAfold v2.4.14. Sum p(unpaired): Sum of unpaired probability, calculated in RNAfold v2.4.14. f Introduction of pseudouridine (ψ) and N1-methylpseudouridine (m1ψ) modifications stabilizes selected short RNAs at U nucleotides in both loop motifs and in fully unstructured RNAs. g Capillary electrophoresis characterization of fragmentation time courses of Nluc mRNA molecules designed with extensive structure (LinearDesign-1) and relatively less structure (Yellowstone), synthesized with standard nucleotides and with ψ modifications. The full-length mRNA band is indicated with a red asterisk. 
The Tetrahymena ribozyme P4-P6 domain RNA was included after degradation as a control. This experiment was repeated independently twice with similar results (cf. Supplementary Fig. 10). h Exponential fits of capillary electrophoresis measurements of intact RNA over ten time points confirm significant differences between in-solution lifetimes of LinearDesign-1 and Yellowstone Nluc mRNAs. Inset: Calculated half-lives. mRNA half-life data are presented as mean values ± SD, as estimated from one biological replicate via bootstrapped exponential fits. Asterisks correspond to two-sided significance tests with ****p < 0.0001, ***p < 0.001, **p < 0.01.
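The half-lives in this caption are derived from exponential decay fits. As a rough sketch of that calculation, the snippet below assumes single-exponential decay of the intact RNA fraction, f(t) = exp(-k t), and fits k by least squares on the log-transformed data. The function name and the fitting shortcut are illustrative assumptions; the authors additionally estimated uncertainties by bootstrapping, which is omitted here.

```python
import math


def fit_half_life(times, fraction_intact):
    """Estimate a half-life from a degradation time course.

    Assumes f(t) = exp(-k * t) with f(0) = 1, so ln f = -k * t is a line
    through the origin. The least-squares slope gives k, and the
    half-life is ln 2 / k.
    """
    num = sum(t * math.log(f) for t, f in zip(times, fraction_intact))
    den = sum(t * t for t in times)
    k = -num / den
    return math.log(2) / k


# Hypothetical, noise-free time course with k = 0.1 per hour.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
fractions = [math.exp(-0.1 * t) for t in times]
print(fit_half_life(times, fractions))
```

A log-linear fit weights late, low-signal time points heavily, so a nonlinear fit on the untransformed data may be preferable for noisy measurements.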
Fig. 4. Integration of 5′/3′ UTRs, structure-optimized CDSs, and pseudouridine (ψ) together enhance mRNA stability and translational output.
a CDS and 5′/3′ UTR combinations differentially impact protein synthesis. Six mRNA constructs were in vitro synthesized and luciferase activity was measured 6 or 24 h post-transfection. Inclusion of ψ was tested on two selected constructs. Bars indicate the geometric mean of Nluc/Fluc reporter activity ratios normalized relative to Nluc start/hHBB UTRs. Error bars indicate geometric standard deviation. n = 4 biologically independent samples. b Workflow for different approaches to design the CDS variants tested in (c). c Variations in CDS design facilitate high in-solution stability and differential protein expression. In vitro transcribed mRNAs (24 in total) were subjected to in-solution degradation or transfected into HEK293T cells for 6 and 24 h. In-solution half-lives and luciferase activity are normalized to the Nluc start reference construct. Predicted secondary structures are shown for select constructs with colors indicating DegScore at each nucleotide. Designs derived from LinearDesign solutions are marked with a purple triangle. Asterisks correspond to two-sided significance tests with ****p < 0.0001, ***p < 0.001, **p < 0.01. Exact p-values are provided in Supplementary Data 5. Bars indicate the mean of Nluc/Fluc reporter activity ratios normalized relative to Nluc start. Error bars indicate standard deviation across n ≥ 3 biologically independent samples. d Predicted secondary structure overview of Ribotree_LinearDesign_degscoreall_1. Zoomed boxes indicate sequence optimizations and subsequent structural changes made by DegScore to the reference LinearDesign construct. e Increased in-solution half-life correlates with DegScore. Significance test for Spearman correlation value: two-sided p-value for a hypothesis test whose null hypothesis is that two sets of data are uncorrelated, n = 24. Error bars indicate standard deviation across n ≥ 3 biologically independent samples.
Fig. 5. Stability and cellular expression of selected highly structured RNA designs in solution and formulated with polyplex.
a Schematic for testing the synergy between RNA modifications and mRNA design rules on downstream stability and protein output. mRNAs were in vitro synthesized with or without ψ and subjected to degradation conditions. Samples were collected over time and the RNA was purified before being transfected into HEK293T cells. Luciferase activity was measured 24 h after transfection. b Luciferase activity of the reference Nluc sequence and DegScore-optimized CDS with or without ψ after being subjected to in-solution degradation. mRNA half-lives (t1/2) per construct are given in hours (hrs). Plotted on the y-axis is the geometric mean of Nluc/Fluc reporter activity ratios normalized to time zero. Error bars indicate geometric standard deviation. n = 4 biologically independent samples. c Schematic for testing the effect of RNA formulation on downstream stability and protein output from selected RNA designs. mRNAs were in vitro synthesized, formulated with polyplex (PLX), and subjected to degradation conditions and/or expression analysis. Samples were collected over time and the formulated RNA was added to HEK293T cells. d In vitro stability of RNAs formulated with polyplex over 14 days at 5 °C. RNA half-lives were calculated based on the degradation slopes: Nluc start (reference) (14 days), Genewiz_1 (30 days), BugacMan’s_Lost_LD + finetuning_mod_Deg-2-ed (58 days), RLT-10 (69 days) and Ribotree_LinearDesign_degscoreall_1 (46 days). Results correspond to technical duplicates. e Expression of Nluc from HEK293T cells transfected with selected RNA designs formulated with polyplex. Expression was measured by fluorescence after the RNAs were formulated with polyplex, incubated at 5 °C in degradation conditions for 0 and 14 days, and then added to the medium of the cultured cells. Results correspond to technical replicates; normalized Nluc/Fluc activity ± SD. n = 3; ns not significant. *p ≤ 0.05 was considered significant (two-tailed unpaired Student’s t-test; ns: p > 0.05; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001; ****p ≤ 0.0001).
