A narrow range of transcript-error rates across the Tree of Life

Weiyi Li¹, Stephan Baehr², Michelle Marasco³, Lauren Reyes², Danielle Brister², Craig S Pikaard³, Jean-Francois Gout⁴, Marc Vermulst⁵, Michael Lynch²

Affiliations

¹ Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
² Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85287, USA.
³ Department of Biology, Howard Hughes Medical Institute, Indiana University, Bloomington, IN 47405, USA.
⁴ Department of Biological Sciences, Mississippi State University, Mississippi State, MS 39762, USA.
⁵ Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA.

PMID: 40644547
PMCID: PMC12248287
DOI: 10.1126/sciadv.adv9898

Weiyi Li et al. Sci Adv. 2025 Jul 11;11(28):eadv9898. doi: 10.1126/sciadv.adv9898. Epub 2025 Jul 11.

Abstract

Although transcript-error rates are markedly higher than DNA-level mutation rates, a broad perspective on the degree to which they diverge across lineages remains to be developed. Using modified rolling-circle sequencing, we found a narrow range of transcript-error rates across the Tree of Life, with little evidence supporting local control of error rates associated with gene expression levels. Most errors result in missense changes if translated, and, as with a fraction of nonsense errors, these are underrepresented relative to random expectations, suggesting the existence of mechanisms for purging some such errors. To understand how natural selection and random genetic drift might shape transcript-error rates, we present a model based on cell biology and population genetics. However, while this framework helps understand the evolution of this highly conserved trait, as currently structured, it explains only 20% of the variation in the data, suggesting a need for further theoretical work in this area.

Figures

**Fig. 1. Rates and molecular spectra of transcript errors across species.**
(A) A summary of all available estimates of transcript-error rates from mRNAs transcribed from nuclear/nucleoid chromosomes and RNAs from mitochondria- and chloroplast-encoded RNAPs. Estimates for *Agrobacterium tumefaciens*, *Bacillus subtilis*, *E. coli*, *M. florum*, *C. elegans*, *D. melanogaster*, *H. sapiens*, and *M. musculus* were obtained from previous studies using the same modified CirSeq method (table S1). Error bars for estimates reported in this study denote standard errors of the mean error rates calculated from $\sqrt{\mu (1 - \mu) / n}$, where $\mu$ is the mean error rate of each species per category and $n$ denotes the corresponding number of rNTPs assayed. The phylogeny, constructed without branch lengths, was generated based on the NCBI taxonomy database and Wang and Luo (66). (B) The molecular spectra of transcript errors from nuclear/nucleoid chromosomal mRNAs. The conditional error rates of each type of rNTP substitution were calculated from the number of detected errors divided by the number of corresponding rNTPs evaluated. Error bars indicate standard errors.
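The standard-error formula in this caption is the usual binomial one and can be sketched in a few lines of Python. The function name and the counts below are hypothetical and chosen only for illustration; they are not values from the study.

```python
import math


def error_rate_se(errors: int, n_assayed: int) -> tuple[float, float]:
    """Return (mean error rate, standard error) for a binomial proportion.

    errors    -- number of transcript errors detected
    n_assayed -- number of rNTPs assayed (the n in sqrt(mu * (1 - mu) / n))
    """
    if n_assayed <= 0:
        raise ValueError("n_assayed must be positive")
    mu = errors / n_assayed
    se = math.sqrt(mu * (1.0 - mu) / n_assayed)
    return mu, se


# Hypothetical counts, for illustration only (not data from the paper).
mu, se = error_rate_se(errors=400, n_assayed=100_000_000)
print(f"rate = {mu:.2e}, SE = {se:.2e}")
```

Because $\mu$ is very small, the standard error is close to $\sqrt{\mu / n}$, so it shrinks roughly with the square root of the number of rNTPs assayed.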

**Fig. 2. Characterization of potential functional effects of transcript errors.**
(A) Fractions of transcript errors from chromosomal protein-coding genes classified as synonymous, missense, and nonsense if translated. Stop-codon loss errors, detected at minimal percentages in *A. thaliana*, *C. reinhardtii*, *P. tetraurelia*, and *P. caudatum*, are not displayed. (B) The ratio of observed (obs) to expected (exp) rates for each type of error. The expected error rates were calculated assuming a random generation of transcript errors according to the bias of rNTP substitution rates and codon usages of each species, in the absence of any correction mechanisms. Each dot represents a ratio from one species. The ratio of 1.0 is indicated by a dashed line.
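One way to picture the expected fractions in panel B is the following minimal sketch, which is not the authors' pipeline. It assumes the standard genetic code (some species in the study, such as *Paramecium*, use a variant code), and the names `classify`, `expected_fractions`, `codon_usage`, and `sub_rate` are hypothetical. Each possible single-base substitution is weighted by codon frequency and by the conditional substitution rate, then tallied by its effect if translated.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons enumerated with U, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(codon: str, pos: int, new_base: str) -> str:
    """Classify one single-base substitution by its effect if translated."""
    mutant = codon[:pos] + new_base + codon[pos + 1:]
    old_aa, new_aa = CODE[codon], CODE[mutant]
    if old_aa == "*":
        # Stop-to-stop changes are counted as synonymous in this sketch.
        return "synonymous" if new_aa == "*" else "stop_loss"
    if new_aa == "*":
        return "nonsense"
    return "synonymous" if new_aa == old_aa else "missense"


def expected_fractions(codon_usage: dict, sub_rate: dict) -> dict:
    """Expected share of each error class under random error generation.

    codon_usage -- codon -> relative frequency in the transcriptome
    sub_rate    -- (from_base, to_base) -> conditional substitution rate
    """
    totals = {"synonymous": 0.0, "missense": 0.0, "nonsense": 0.0, "stop_loss": 0.0}
    for codon, freq in codon_usage.items():
        for pos, old in enumerate(codon):
            for new in BASES:
                if new != old:
                    totals[classify(codon, pos, new)] += freq * sub_rate[(old, new)]
    grand_total = sum(totals.values())
    return {k: v / grand_total for k, v in totals.items()}


# Illustrative inputs only: uniform usage of sense codons, uniform substitution rates.
uniform_usage = {c: 1.0 for c, aa in CODE.items() if aa != "*"}
uniform_rates = {(a, b): 1.0 for a in BASES for b in BASES if a != b}
print(expected_fractions(uniform_usage, uniform_rates))
```

Dividing the observed fraction of each class by the corresponding expected fraction gives a ratio of the kind plotted in panel B; values below 1.0 indicate underrepresentation relative to the random expectation.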

**Fig. 3. The variation of transcript-error rates across species.**
(A) Transcript-error and genomic-mutation rates (substitutions per nucleotide site per generation) for species across the Tree of Life. Genomic-mutation rate data are derived from Lynch *et al.* (4), Long *et al.* (67), and Lynch and Trickovic (5). Diagonal dashed lines are isoclines for ratios of the two rates. (B) Log-log regression of the observed transcript-error rates for unicellular species as a function of the composite parameter described in the text; the best fit (given by the regression line) is obtained with exponent value $x$ = 0.5 (with the statistical fits with alternative $x$ values given in the inset and the support interval within which the regression remains significant at the 0.05 level denoted by the dashed line). The data points for multicellular eukaryotes are given for reference but not used in the regression. $P \bar{L}$ denotes the total proteome size (summed over all codons; in megabases). Standard errors of the individual measures of transcript-error rates are generally smaller than the widths of the points. All data, including those for unlabeled bacteria (which include all species except for *Kineococcus*, for which $N_{e}$ data were unavailable), are shown in table S1. At = *A. thaliana*; Ce = *C. elegans*; Cr = *Chlamydomonas reinhardtii*; Dm = *D. melanogaster*; Hs = *H. sapiens*; Mf = *M. florum*; Mm = *M. musculus*; Pc = *P. caudatum*; Pt = *P. tetraurelia*; Sc = *S. cerevisiae*.

**Fig. 4. Transcript-error rates of chromosomal protein-coding genes at different expression levels.**
A generalized linear model (Materials and Methods) was applied to expression levels (FPKM) and transcript-error rates of individual protein-coding genes to evaluate potential correlations. Expression levels of genes were obtained from CirSeq reads, which provide estimates for expression levels consistent with regular RNA-seq reads (fig. S1). Slopes with P values less than 0.05/21 (Bonferroni correction for regression analyses for 21 species) are considered statistically significant. Significant positive and negative regressions are highlighted with blue and red lines, respectively. For visualization purposes only, protein-coding genes are ranked according to expression levels (low to high) and grouped into bins of equal cumulative width in the total number of sequenced rNTPs with 5% increments. Discrete FPKM of some genes in *M. musculus*, *P. caudatum*, and *M. florum* result in a few of the 20 bins being empty, and these are combined with adjacent bins on the larger FPKM side. Error bars denote standard errors of the mean error rates calculated from $\sqrt{\mu (1 - \mu) / n}$, where $\mu$ is the mean error rate of the bin and $n$ denotes the number of rNTPs assayed in the corresponding bin.
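The binning and the significance threshold described in this caption can be sketched as follows. This is an illustrative reading of the caption rather than the authors' code: the function names and the input tuple layout are hypothetical, a gene is assigned to the bin in which its cumulative coverage starts, and empty bins are simply dropped instead of being merged explicitly.

```python
N_SPECIES = 21
BONFERRONI_ALPHA = 0.05 / N_SPECIES  # per-species threshold, about 0.0024


def is_significant(p_value: float) -> bool:
    """Apply the Bonferroni-corrected threshold to one regression slope."""
    return p_value < BONFERRONI_ALPHA


def bin_genes(genes, n_bins: int = 20):
    """Group genes into bins of equal cumulative rNTP coverage.

    genes -- iterable of (fpkm, errors, rntps) with integer errors and rntps
    Returns a list of (errors, rntps) totals, ordered from low to high FPKM.
    """
    ranked = sorted(genes, key=lambda g: g[0])
    total = sum(g[2] for g in ranked)
    bins = [[0, 0] for _ in range(n_bins)]
    cumulative = 0
    for _fpkm, errors, rntps in ranked:
        # Integer arithmetic avoids rounding problems at bin boundaries.
        idx = min(cumulative * n_bins // total, n_bins - 1)
        bins[idx][0] += errors
        bins[idx][1] += rntps
        cumulative += rntps
    # Simplification: drop empty bins rather than merging them.
    return [(e, n) for e, n in bins if n > 0]


# Hypothetical genes, for illustration only: (FPKM, errors, rNTPs assayed).
demo = [(float(i), 1, 100) for i in range(1, 41)]
for errors, rntps in bin_genes(demo)[:3]:
    print(errors / rntps)
```

Each bin's error rate is its error count divided by its rNTP count, and the error bar follows from the same binomial standard-error formula given in the caption.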

Update of

A Narrow Range of Transcript-error Rates Across the Tree of Life.
Li W, Baehr S, Marasco M, Reyes L, Brister D, Pikaard CS, Gout JF, Vermulst M, Lynch M. bioRxiv [Preprint]. 2025 Jan 14:2023.05.02.538944. doi: 10.1101/2023.05.02.538944. Update in: Sci Adv. 2025 Jul 11;11(28):eadv9898. doi: 10.1126/sciadv.adv9898. PMID: 39868080.

References

1. Ohta T., Slightly deleterious mutant substitutions in evolution. Nature 246, 96–98 (1973).
2. Hartl D. L., Dykhuizen D. E., Dean A. M., Limits of adaptation: The evolution of selective neutrality. Genetics 111, 655–674 (1985).
3. Lynch M., The Origins of Genome Architecture (Oxford Univ. Press, 2007).
4. Lynch M., Ackerman M. S., Gout J.-F., Long H., Sung W., Thomas W. K., Foster P. L., Genetic drift, selection and the evolution of the mutation rate. Nat. Rev. Genet. 17, 704–714 (2016).
5. Lynch M., Trickovic B., A theoretical framework for evolutionary cell biology. J. Mol. Biol. 432, 1861–1879 (2020).
