Sci Adv. 2025 Jul 11;11(28):eadv9898.
doi: 10.1126/sciadv.adv9898. Epub 2025 Jul 11.

A narrow range of transcript-error rates across the Tree of Life

Weiyi Li et al. Sci Adv.

Abstract

Although transcript-error rates are markedly higher than DNA-level mutation rates, a broad perspective on the degree to which they diverge across lineages remains to be developed. Using modified rolling-circle sequencing, we found a narrow range of transcript-error rates across the Tree of Life, with little evidence supporting local control of error rates associated with gene expression levels. Most errors result in missense changes if translated, and, as with a fraction of nonsense errors, these are underrepresented relative to random expectations, suggesting the existence of mechanisms for purging some such errors. To understand how natural selection and random genetic drift might shape transcript-error rates, we present a model based on cell biology and population genetics. However, while this framework helps understand the evolution of this highly conserved trait, as currently structured, it explains only 20% of the variation in the data, suggesting a need for further theoretical work in this area.

Figures

Fig. 1. Rates and molecular spectra of transcript errors across species.
(A) A summary of all available estimates of transcript-error rates from mRNAs transcribed from nuclear/nucleoid chromosomes and RNAs from mitochondria- and chloroplast-encoded RNAPs. Estimates for Agrobacterium tumefaciens, Bacillus subtilis, E. coli, M. florum, C. elegans, D. melanogaster, H. sapiens, and M. musculus were obtained from previous studies using the same modified CirSeq method (table S1). Error bars for estimates reported in this study denote standard errors of the mean error rates, calculated as √(μ(1 − μ)/n), where μ is the mean error rate of each species per category and n denotes the corresponding number of rNTPs assayed. The phylogeny, constructed without branch lengths, was generated based on the NCBI taxonomy database and Wang and Luo (66). (B) The molecular spectra of transcript errors from nuclear/nucleoid chromosomal mRNAs. The conditional error rate of each type of rNTP substitution was calculated as the number of detected errors divided by the number of corresponding rNTPs evaluated. Error bars indicate standard errors.
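The error-bar calculation in this caption can be sketched in a few lines of Python. This is a minimal illustration of the binomial standard error, not the authors' pipeline; the function name `error_rate_with_se` and the counts in the example are hypothetical.

```python
import math


def error_rate_with_se(n_errors, n_assayed):
    """Return (mu, se) for n_errors detected among n_assayed rNTPs.

    mu is the per-rNTP error rate and se is the binomial standard error
    sqrt(mu * (1 - mu) / n), as described in the caption.
    """
    if n_assayed <= 0:
        raise ValueError("n_assayed must be positive")
    mu = n_errors / n_assayed
    se = math.sqrt(mu * (1.0 - mu) / n_assayed)
    return mu, se


# Hypothetical counts for illustration only (not data from the study).
mu, se = error_rate_with_se(n_errors=50, n_assayed=10_000_000)
print(f"rate = {mu:.2e}, SE = {se:.2e}")
```

The same function applies to the conditional rates in panel (B) if `n_errors` counts one substitution type and `n_assayed` counts only the rNTPs of the corresponding original base.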
Fig. 2. Characterization of potential functional effects of transcript errors.
(A) Fractions of transcript errors from chromosomal protein-coding genes that would be synonymous, missense, or nonsense if translated. Stop-codon loss errors, detected at minimal percentages in A. thaliana, C. reinhardtii, P. tetraurelia, and P. caudatum, are not displayed. (B) The ratio of observed (obs) to expected (exp) rates for each type of error. The expected error rates were calculated assuming random generation of transcript errors according to the rNTP substitution-rate bias and codon usage of each species, in the absence of any correction mechanisms. Each dot represents the ratio for one species. A ratio of 1.0 is indicated by a dashed line.
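The expected fractions in panel (B) can be illustrated with a short Python sketch. It assumes the standard genetic code, considers only errors arising in sense codons (stop-codon loss is ignored), and writes codons in the DNA alphabet (T in place of U). The function names and the optional `codon_usage` and `sub_rate` inputs are hypothetical; the study's actual procedure may differ in detail.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(codon, pos, new_base):
    """Classify a single-base change in a sense codon."""
    mutated = codon[:pos] + new_base + codon[pos + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if new_aa == old_aa:
        return "synonymous"
    if new_aa == "*":
        return "nonsense"
    return "missense"


def expected_fractions(codon_usage=None, sub_rate=None):
    """Expected fractions of error classes under random error generation.

    codon_usage: optional dict codon -> relative frequency (default: uniform).
    sub_rate: optional dict (old_base, new_base) -> relative rate
              (default: uniform).
    """
    weights = {"synonymous": 0.0, "missense": 0.0, "nonsense": 0.0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # assumption: only errors in sense codons are counted
        usage = 1.0 if codon_usage is None else codon_usage.get(codon, 0.0)
        for pos, old in enumerate(codon):
            for new in BASES:
                if new == old:
                    continue
                rate = 1.0 if sub_rate is None else sub_rate.get((old, new), 0.0)
                weights[classify(codon, pos, new)] += usage * rate
    total = sum(weights.values())
    return {kind: w / total for kind, w in weights.items()}


def obs_over_exp(observed_counts, expected):
    """Ratio of observed to expected fractions for each error class."""
    total = sum(observed_counts.values())
    return {k: (observed_counts[k] / total) / expected[k] for k in expected}


uniform = expected_fractions()
print(uniform)
```

With uniform codon usage and substitution rates, the 549 possible single-base changes in the 61 sense codons split into 134 synonymous, 392 missense, and 23 nonsense changes. Supplying species-specific `codon_usage` and `sub_rate` shifts these baselines, and a ratio from `obs_over_exp` below 1.0 indicates underrepresentation of that class.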
Fig. 3. The variation of transcript-error rates across species.
(A) Transcript-error and genomic-mutation rates (substitutions per nucleotide site per generation) for species across the Tree of Life. Genomic-mutation rate data are derived from Lynch et al. (4), Long et al. (67), and Lynch and Trickovic (5). Diagonal dashed lines are isoclines for ratios of the two rates. (B) Log-log regression of the observed transcript-error rates for unicellular species as a function of the composite parameter described in the text. The best fit (given by the regression line) is obtained with exponent value x = 0.5; the inset gives the statistical fits for alternative x values, with the dashed line denoting the support interval within which the regression remains significant at the 0.05 level. The data points for multicellular eukaryotes are given for reference but not used in the regression. PL̄ denotes the total proteome size (summed over all codons; in megabases). Standard errors of the individual measures of transcript-error rates are generally smaller than the widths of the points. All data, including those for unlabeled bacteria (which include all species except for Kineococcus, for which Ne data were unavailable), are shown in table S1. At = A. thaliana; Ce = C. elegans; Cr = Chlamydomonas reinhardtii; Dm = D. melanogaster; Hs = H. sapiens; Mf = M. florum; Mm = M. musculus; Pc = P. caudatum; Pt = P. tetraurelia; Sc = S. cerevisiae.
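For readers unfamiliar with log-log regression, the generic procedure can be sketched as ordinary least squares on log10-transformed values. The composite parameter itself is defined in the main text and is not reproduced here; the function name `loglog_fit` and the input values are hypothetical, and the example points are synthetic (generated from an exact power law), not data from table S1.

```python
import math


def loglog_fit(x, y):
    """Least-squares fit of log10(y) = intercept + slope * log10(x).

    Returns (slope, intercept, r_squared). All inputs must be positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic points lying exactly on y = 3e-5 * x**-0.25 (illustration only).
xs = [1e2, 1e3, 1e4, 1e5]
ys = [3e-5 * v ** -0.25 for v in xs]
slope, intercept, r_squared = loglog_fit(xs, ys)
print(slope, intercept, r_squared)
```

On log-log axes a power law appears as a straight line whose slope is the exponent. For the same reason, the isoclines in panel (A) are diagonals: a constant ratio between the two rates corresponds to a line of slope 1.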
Fig. 4. Transcript-error rates of chromosomal protein-coding genes at different expression levels.
A generalized linear model (Materials and Methods) was applied to expression levels (FPKM) and transcript-error rates of individual protein-coding genes to evaluate potential correlations. Expression levels of genes were obtained from CirSeq reads, which provide estimates for expression levels consistent with regular RNA-seq reads (fig. S1). Slopes with P values less than 0.05/21 (Bonferroni correction for regression analyses for 21 species) are considered statistically significant. Significant positive and negative regressions are highlighted with blue and red lines, respectively. For visualization purposes only, protein-coding genes are ranked according to expression levels (low to high) and grouped into bins of equal cumulative width in the total number of sequenced rNTPs with 5% increments. Discrete FPKM of some genes in M. musculus, P. caudatum, and M. florum result in a few of the 20 bins being empty, and these are combined with adjacent bins on the larger FPKM side. Error bars denote standard errors of the mean error rates, calculated as √(μ(1 − μ)/n), where μ is the mean error rate of the bin and n denotes the number of rNTPs assayed in the corresponding bin.
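The significance threshold and the visualization binning described in this caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function names are hypothetical, each gene is assumed to be a tuple of (FPKM, errors detected, rNTPs assayed), and empty bins are simply skipped, which simplifies the merging rule stated in the caption. The generalized linear model itself is not reproduced.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def bin_by_cumulative_rntps(genes, n_bins=20):
    """Group genes, ranked by FPKM, into bins of equal cumulative rNTP share.

    genes: iterable of (fpkm, n_errors, n_rntps) with integer counts.
    Returns a list of (mean_rate, standard_error, n_rntps) per non-empty bin.
    """
    ranked = sorted(genes, key=lambda g: g[0])
    total = sum(n for _, _, n in ranked)
    if total <= 0:
        raise ValueError("no rNTPs assayed")
    errors = [0] * n_bins
    assayed = [0] * n_bins
    cumulative = 0
    for _, n_err, n in ranked:
        # A gene goes to the bin containing the cumulative share before it.
        idx = min(n_bins - 1, n_bins * cumulative // total)
        errors[idx] += n_err
        assayed[idx] += n
        cumulative += n
    result = []
    for e, n in zip(errors, assayed):
        if n == 0:
            continue  # simplification: empty bins are dropped, not merged
        mu = e / n
        result.append((mu, math.sqrt(mu * (1.0 - mu) / n), n))
    return result


threshold = bonferroni_threshold(0.05, 21)

# Hypothetical genes for illustration only (not data from the study).
demo_genes = [(4.0, 4, 100), (1.0, 1, 100), (3.0, 3, 100), (2.0, 2, 100)]
demo_bins = bin_by_cumulative_rntps(demo_genes, n_bins=4)
print(threshold, demo_bins)
```

With 21 species, the per-test threshold is 0.05/21, roughly 0.0024. Binning by cumulative rNTPs rather than by gene count gives each plotted point a similar sample size, so the error bars are comparable across bins.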

References

    1. Ohta T., Slightly deleterious mutant substitutions in evolution. Nature 246, 96–98 (1973).
    2. Hartl D. L., Dykhuizen D. E., Dean A. M., Limits of adaptation: The evolution of selective neutrality. Genetics 111, 655–674 (1985).
    3. Lynch M., The Origins of Genome Architecture (Oxford Univ. Press, 2007).
    4. Lynch M., Ackerman M. S., Gout J.-F., Long H., Sung W., Thomas W. K., Foster P. L., Genetic drift, selection and the evolution of the mutation rate. Nat. Rev. Genet. 17, 704–714 (2016).
    5. Lynch M., Trickovic B., A theoretical framework for evolutionary cell biology. J. Mol. Biol. 432, 1861–1879 (2020).