Nat Commun. 2024 Nov 6;15(1):9601.
doi: 10.1038/s41467-024-54059-7.

Predicting synthetic mRNA stability using massively parallel kinetic measurements, biophysical modeling, and machine learning

Daniel P Cetnar et al. Nat Commun. 2024.

Abstract

mRNA degradation is a central process that affects all gene expression levels, though it remains challenging to predict the stability of an mRNA from its sequence, due to the many coupled interactions that control degradation rate. Here, we carried out massively parallel kinetic decay measurements on over 50,000 bacterial mRNAs, using a learn-by-design approach to develop and validate a predictive sequence-to-function model of mRNA stability. mRNAs were designed to systematically vary translation rates, secondary structures, sequence compositions, G-quadruplexes, i-motifs, and RppH activity, resulting in mRNA half-lives from about 20 seconds to 20 minutes. We combined biophysical models and machine learning to develop steady-state and kinetic decay models of mRNA stability with high accuracy and generalizability, utilizing transcription rate models to identify mRNA isoforms and translation rate models to calculate ribosome protection. Overall, the developed model quantifies the key interactions that collectively control mRNA stability in bacterial operons and predicts how changing mRNA sequence alters mRNA stability, which is important when studying and engineering bacterial genetic systems.

Conflict of interest statement

H.M.S. is a founder of De Novo DNA. D.P.C., A.H., and G.E.V. declare no competing interests.

Figures

Fig. 1. Massively parallel design and kinetic decay measurements.
(A) 62,120 barcoded genetic systems with varied 5′ UTR sequences were constructed using oligopool synthesis and library-based cloning. mRNA stabilities in exponentially growing E. coli cells were measured using rifampicin treatment to halt transcription initiation, followed by kinetic mRNA level measurements using DNA-Seq and RNA-Seq. (B) mRNA sequences were designed with combinations of factors affecting mRNA decay rate, including RppH binding affinity, mRNA secondary and tertiary structure, and mRNA translation rate. (C) Time course mRNA level measurements (dots) were fitted to exponential decay functions (lines), using spike-in RNA controls for normalization. Four characterized mRNAs with widely different mRNA decay rates are shown. (D) The distribution of measured half-lives is shown.
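The exponential-decay fit described for the time course measurements can be illustrated with a minimal sketch. It assumes single-exponential decay, m(t) = m0 · exp(−k · t), and fits it by ordinary least squares on log-transformed levels; the fitting procedure actually used in the study is not stated here, so this is an illustration only. The function names, the sample decay rate, and the data are hypothetical. The time points (0, 2, 4, 8, and 16 min) follow those listed in the Fig. 3 caption.

```python
import math

def fit_exponential_decay(times_min, levels):
    """Fit levels ~ m0 * exp(-k * t) by least squares on log(levels).

    Assumes all levels are positive. Returns (m0, k), with k in 1/min.
    """
    logs = [math.log(v) for v in levels]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, logs))
    slope = sxy / sxx                      # slope of log(level) versus time
    k = -slope                             # first-order decay rate
    m0 = math.exp(y_mean - slope * t_mean) # level at t = 0
    return m0, k

def half_life(k):
    """Half-life of a first-order decay process: ln(2) / k."""
    return math.log(2) / k

# Hypothetical, noise-free time course (minutes after rifampicin addition).
times = [0.0, 2.0, 4.0, 8.0, 16.0]
true_k = 0.35  # assumed decay rate in 1/min, for illustration only
levels = [100.0 * math.exp(-true_k * t) for t in times]

m0, k = fit_exponential_decay(times, levels)
print(f"m0 = {m0:.2f}, k = {k:.3f} per min, half-life = {half_life(k):.2f} min")
```

Under this relationship, the reported half-life range of about 20 seconds to 20 minutes corresponds to decay rates spanning roughly a 60-fold range.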
Fig. 2. Multi-factor sequence determinants controlling bacterial mRNA decay.
(A) Measured mRNA decay rates are shown when varying the predicted mRNA translation initiation rates, while only including 5′ UTRs with stable (less active) RppH binding sites and moderate single-stranded RNA lengths (Pearson R = −0.31, two-tailed p = 4.2 × 10⁻²⁷). (B) Measured mRNA decay rates are shown when varying single-stranded RNA lengths inside the 5′ UTR, while only including 5′ UTRs with stable RppH binding sites and low predicted translation initiation rates (<5000 au) (small size effect = 0.0008 for 0 to 30 nt, Pearson R = −0.19, two-tailed p = 1.6 × 10⁻¹⁴). (C) Measured mRNA decay rates are shown when varying single-stranded RNA lengths inside the 5′ UTR, now including 5′ UTRs with stable RppH binding sites and high predicted translation initiation rates (>5000 au) (large size effect = 0.01 for 0 to 30 nt, R = 0.924, two-tailed p = 1.3 × 10⁻¹⁴). Measured mRNA decay rates are shown when varying the length and composition of single-stranded RNA regions inside 5′ UTRs, using ribosome binding sites with either (D) low or (E) high predicted translation initiation rates. Box plots show the median value of measured mRNA decay rates (N > 100 mRNAs) across each category (orange line), the 25% and 75% quartile boundaries (boxes), the maximum and minimum of the in-distribution data (bars), and the outliers (points). Measured mRNA decay rates are shown when introducing either (F) G-quadruplex tertiary structures or (G) i-motif tertiary structures into highly translated mRNAs as compared to highly translated mRNAs lacking each type of structure, each indexed by the amount of single-stranded RNA in their 5′ UTRs.
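The Pearson R values quoted in panels A to C measure linear association between a sequence feature and the measured decay rate. As a minimal sketch of that statistic, the function below computes it directly from its definition; the function name and all data values are hypothetical and are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values: predicted translation initiation rates (au) and
# measured decay rates (arbitrary units) for five mRNAs.
translation_rates = [500.0, 2000.0, 8000.0, 20000.0, 60000.0]
decay_rates = [0.012, 0.011, 0.009, 0.010, 0.006]

r = pearson_r(translation_rates, decay_rates)
print(f"Pearson R = {r:.2f}")
```

A negative R, as in panel A, indicates that mRNAs with higher predicted translation initiation rates tend to have lower measured decay rates.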
Fig. 3. A hybrid biophysical-machine learning model of mRNA decay.
(A) The Promoter Calculator and RBS Calculator biophysical models are used to predict the transcription initiation rates and translation initiation rates of the five most predominant mRNA isoforms, followed by calculating the isoforms’ structural features. The biophysical features, measured mRNA levels, and measured mRNA decay rates of 45,283 characterized mRNAs are split into a training dataset (N = 36,219) and an unseen test dataset (N = 9,064), followed by training and testing of a machine learning model (LightGBM). (B) Individually trained models accurately predicted mRNA levels at the steady-state timepoint (train R² = 0.75, test R² = 0.69) and post-rifampicin treatment timepoints (+2 min, train R² = 0.72, test R² = 0.65; +4 min, train R² = 0.68, test R² = 0.60; +8 min, train R² = 0.61, test R² = 0.53; +16 min, train R² = 0.52, test R² = 0.43). (C) The mRNA Stability Calculator accurately predicted mRNA decay rates by combining biophysical features and model-predicted mRNA levels at each timepoint (train R² = 0.56, test R² = 0.43).
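The train/test protocol in panel A can be sketched as follows. This sketch covers only the data split and the scoring step, not the LightGBM model itself. It assumes a uniformly random split and treats R² as the coefficient of determination; the caption specifies neither, so both are assumptions. The function names and the seed are hypothetical. The dataset sizes are those given in the caption, which amount to roughly an 80/20 split.

```python
import random

def train_test_split(items, n_test, seed):
    """Shuffle a copy of items reproducibly and hold out n_test of them."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Indices standing in for the 45,283 characterized mRNAs in the caption.
mrna_ids = range(45283)
train_ids, test_ids = train_test_split(mrna_ids, n_test=9064, seed=0)
print(len(train_ids), len(test_ids))

# Hypothetical observed and predicted values, to show the scoring step.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

Scoring on held-out mRNAs that the model never saw during training is what allows the test R² values to be read as estimates of generalization rather than of fit.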
Fig. 4. Model-predicted design rules for controlling mRNA levels and decay rate.
(A) RppH binding site sequences are ranked according to their model-predicted changes in steady-state mRNA levels. The top and bottom 10 RppH binding sites are shown that (red) decrease or (blue) increase mRNA stability. (B) Model-predicted design rules for changing steady-state mRNA levels when varying (left) the translation initiation rate of a CDS, (middle) the amount of single-stranded RNA in the 5′ UTR with either polyA, polyG, polyC, or polyU composition, or (right) the amount of double-stranded RNA in the 5′ UTR.

