Evol Med Public Health. 2022 Mar 29;10(1):142-155.
doi: 10.1093/emph/eoac010. eCollection 2022.

Mutation rate of SARS-CoV-2 and emergence of mutators during experimental evolution

Massimo Amicone et al. Evol Med Public Health.

Abstract

Background and objectives: To understand how organisms evolve, it is fundamental to study how mutations emerge and establish. Here, we estimated the rate of mutation accumulation of SARS-CoV-2 in vitro and investigated the repeatability of its evolution when facing a new cell type but no immune or drug pressures.

Methodology: We performed experimental evolution with two strains of SARS-CoV-2, one carrying the originally described spike protein (CoV-2-D) and another carrying the D614G mutation that has spread worldwide (CoV-2-G). After 15 passages in Vero cells and whole-genome sequencing, we characterized the spectrum and rate of the emerging mutations and looked for evidence of selection across the genomes of both strains.

Results: From the frequencies of the accumulated mutations, and excluding the genes with signals of selection, we estimate a spontaneous mutation rate of $1.3 \times 10^{-6} \pm 0.2 \times 10^{-6}$ per base per infection cycle (mean across both lineages of SARS-CoV-2 ± 2 SEM). We further show that mutation accumulation is larger in the CoV-2-D lineage and heterogeneous along the genome, consistent with the action of positive selection on the spike protein, which accumulated five times more mutations than the corresponding genomic average. We also observe the emergence of mutators in the CoV-2-G background, likely linked to mutations in the RNA-dependent RNA polymerase and/or in the error-correcting exonuclease protein.

Conclusions and implications: These results provide valuable information on how spontaneous mutations emerge in SARS-CoV-2 and on how selection can shape its genome toward adaptation to new environments.

Lay Summary: Each time a virus replicates inside a cell, errors (mutations) occur. Here, via laboratory propagation in cells originally isolated from the kidney epithelium of African green monkeys, we estimated the rate at which the SARS-CoV-2 virus mutates, an important parameter for understanding how it can evolve within and across humans. We also confirm the potential of its Spike protein to adapt to a new environment and report the emergence of mutators, that is, viral populations where mutations occur at a significantly faster rate.

Keywords: SARS-CoV-2; experimental evolution; mutation rate; mutator; virus adaptation.

Figures

Figure 1.
Experimental design and mutation accumulation after 15 passages of SARS-CoV-2 evolution. (a) Schematic of the experimental design of the mutation accumulation experiments, in which two viral backgrounds were propagated in Vero cells (figure created with BioRender.com). (b) Number of mutations observed in each well and group; 15 lines of the CoV-2-G background accumulated a larger number of mutations and thus were defined as mutators (gold). The means of each group are presented by vertical dashed lines and reported in the figure (± 2 SEM). (c) Proportion of mutation types in each group. Complex mutations and multi-nucleotide polymorphisms (MNP) are defined in the Methodology section. (d) Mutation accumulation per base per infection cycle ($M_a$) was calculated by summing the observed mutation frequencies $f$ as $M_a = \frac{\sum f}{P \cdot G}$, where $P$ is the number of passages ($P = 15$) and $G$ is the SARS-CoV-2 genome length ($G = 29\,903$). The means of each group are presented by vertical dashed lines and reported in the figure (± 2 SEM). (e) Proportion of observed nucleotide changes. Dashed lines indicate the expectation given the genome composition under equal mutation probability for each type of nucleotide change. Vertical bars in panels (c) and (e) represent the 95% confidence interval computed as $p \pm z\sqrt{p(1-p)/N}$, with $z = 1.96$.
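As a rough illustration of the panel (d) calculation, the sketch below computes the mutation accumulation for a few evolved lines and summarizes them as mean ± 2 SEM. The values of P and G come from the caption; the function names and the example frequencies are hypothetical and are not taken from the study's data or code.

```python
from math import sqrt
from statistics import mean, stdev

PASSAGES = 15            # P, number of passages (from the caption)
GENOME_LENGTH = 29_903   # G, SARS-CoV-2 genome length (from the caption)


def mutation_accumulation(freqs, passages=PASSAGES, genome_length=GENOME_LENGTH):
    """Sum the observed mutation frequencies and divide by P * G."""
    return sum(freqs) / (passages * genome_length)


def mean_pm_2sem(values):
    """Return (mean, 2 * standard error of the mean) for a list of values."""
    sem = stdev(values) / sqrt(len(values))
    return mean(values), 2 * sem


# Hypothetical mutation frequencies for three evolved lines (illustration only).
lines = [
    [0.50, 0.10, 0.02],
    [0.90, 0.30],
    [0.05, 0.05, 0.20, 0.60],
]

per_line = [mutation_accumulation(f) for f in lines]
avg, half_width = mean_pm_2sem(per_line)
print(f"Ma = {avg:.2e} +/- {half_width:.2e} per base per infection cycle")
```

Summing frequencies rather than counting mutations weights each mutation by how common it is in the sequenced population, so a low-frequency variant contributes less to the estimate than a fixed one.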
Figure 2.
Site frequency spectrum. Proportion of mutations with a given frequency after 15 cycles of propagation in the CoV-2-D (n = 96) and CoV-2-G (n = 79) genetic backgrounds or under a simulated neutral model of mutation accumulation (n = 100, see Methodology). The bump observed at high frequencies in the data is not compatible with the expectation of the neutral model. Here, for the neutral model, we assumed 15 cycles of growth (Poisson-distributed burst size), mutations and bottlenecks without selection (see the effects of larger variation in the viral burst size and of selection in Supplementary Figs S3c and d and S9, respectively)
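The neutral expectation in this figure comes from a simulation of growth, mutation and bottlenecks without selection. The sketch below is a deliberately simplified stand-in for such a model, not the authors' implementation: the bottleneck size, mean burst size and per-genome mutation rate are hypothetical placeholders, and every mutation is treated as a new, unique, neutral site.

```python
import random
from collections import Counter
from math import exp


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's method (fine for modest lam)."""
    threshold = exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_neutral(cycles=15, bottleneck=50, burst=10.0, genomic_rate=0.05, seed=0):
    """Return {mutation_id: frequency} after repeated growth and bottlenecks.

    All parameter defaults are hypothetical, chosen only to keep the run fast.
    """
    rng = random.Random(seed)
    population = [frozenset()] * bottleneck   # founders carry no mutations
    next_id = 0
    for _ in range(cycles):
        offspring = []
        for genome in population:
            for _ in range(poisson(rng, burst)):       # Poisson burst size
                n_new = poisson(rng, genomic_rate)     # new neutral mutations
                child = genome | frozenset(range(next_id, next_id + n_new))
                next_id += n_new
                offspring.append(child)
        if not offspring:                              # population went extinct
            return {}
        # Bottleneck: random sample, so no genotype is favored.
        population = rng.sample(offspring, min(bottleneck, len(offspring)))
    counts = Counter(m for genome in population for m in genome)
    return {m: c / len(population) for m, c in counts.items()}


def site_frequency_spectrum(freqs, bins=10):
    """Proportion of mutations falling in each equal-width frequency bin."""
    hist = [0] * bins
    for f in freqs:
        hist[min(int(f * bins), bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


frequencies = simulate_neutral(seed=1)
print(site_frequency_spectrum(frequencies.values()))
```

Comparing the binned output with an observed spectrum is one way to look for an excess of high-frequency mutations, which the caption reports as incompatible with neutrality.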
Figure 3.
Heterogeneity of mutation accumulation across genes. Per-base mutation accumulation ($M_a$) computed for each gene and for the entire genome shows heterogeneity. The spike gene has the largest accumulation rate in both backgrounds ($M_a(S) = (17.1 \pm 1.0) \times 10^{-6}$ and $(13.5 \pm 0.4) \times 10^{-6}$ for CoV-2-D and CoV-2-G, respectively), which is more than four times their genomic average. For resolution purposes, a few outliers with $M_a$ above 45 are not shown (see full set in Supplementary Fig. S4).
Figure 4.
Gene-wise signs of selection. (a) The relative proportion of non-synonymous to synonymous polymorphism, pN/pS, was computed for each gene and genetic background (see Methodology). The horizontal line indicates the expectation under neutrality (pN/pS = 1); values above it suggest positive selection, while values below it suggest purifying selection. Vertical bars show the 95% distribution of bootstrapped resampling (n = 1000), and the stars indicate the genes where pN/pS significantly differs from 1 (two-proportion z-test, P-value < 0.05 (*), 0.01 (**) or 0.001 (***), after Benjamini–Hochberg correction). For the sake of resolution, we show the confidence intervals within the $[10^{-3}, 10^{3}]$ range. (b) Identifying the genes that affect the estimation of mutation rate. Per-base mutation accumulation ($M_a$) was computed for the entire genome or by excluding each gene one at a time (e.g. ΔS). The stars indicate the cases where removing the gene leads to an estimate of $M_a$ significantly different from that of the whole genome (non-parametric Wilcoxon test, P-value < 0.05 (*), 0.01 (**) or 0.001 (***), after Benjamini–Hochberg correction).
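Both panels report significance after Benjamini–Hochberg correction. The sketch below shows the standard step-up adjustment and the star thresholds given in the caption; the function names and the example P-values are hypothetical, and the code is not taken from the study's analysis.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stars(p):
    """Map an adjusted P-value to the star notation used in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical raw P-values for four genes (illustration only).
raw = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005}
adjusted = benjamini_hochberg(list(raw.values()))
for gene, p_adj in zip(raw, adjusted):
    print(gene, round(p_adj, 4), stars(p_adj))
```

Correcting across genes matters here because each gene is tested separately, so some raw P-values below 0.05 are expected by chance alone.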
Figure 5.
Estimation of mutation rates and bias excluding outlier genes. (a) The per-base per-infection cycle mutation rate was calculated by summing the observed mutation frequencies $f$ as $\mu = \frac{\sum f}{P \cdot G}$, where $P$ is the number of passages ($P = 15$) and $G$ is the length of the SARS-CoV-2 genome excluding the Nsp3, Nsp6 and Spike genes ($G = 29\,903 - 5835 - 870 - 3822 = 19\,376$). The means of each group are presented as vertical dashed lines and reported in the figure (± 2 SEM). (b) Proportion of nucleotide changes observed excluding the Nsp3, Nsp6 and Spike genes. Dotted lines indicate the expectation given the genome composition under equal mutation probability for each type of nucleotide change. Vertical bars represent the 95% confidence interval computed as $p \pm z\sqrt{p(1-p)/N}$, with $z = 1.96$.
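The two calculations in this caption can be sketched as follows: the mutation rate over the reduced genome length, and the normal-approximation confidence interval for a proportion. Gene lengths and z are taken from the caption; the function names, the example frequencies and the example counts are hypothetical, and clipping the interval to [0, 1] is a choice made here rather than something the caption states.

```python
from math import sqrt

PASSAGES = 15
FULL_GENOME = 29_903
# Lengths of the excluded genes, as listed in the caption.
EXCLUDED = {"Nsp3": 5_835, "Nsp6": 870, "Spike": 3_822}
REDUCED_GENOME = FULL_GENOME - sum(EXCLUDED.values())


def mutation_rate(freqs, passages=PASSAGES, genome_length=REDUCED_GENOME):
    """Sum of mutation frequencies divided by P * G."""
    return sum(freqs) / (passages * genome_length)


def proportion_ci(count, total, z=1.96):
    """Interval p +/- z * sqrt(p * (1 - p) / N), clipped to [0, 1]."""
    p = count / total
    half_width = z * sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


print("G without Nsp3, Nsp6 and Spike:", REDUCED_GENOME)

# Hypothetical inputs, for illustration only.
print("mu =", mutation_rate([0.20, 0.10, 0.05]))
print("95% CI:", proportion_ci(count=30, total=100))
```

Only mutations outside the excluded genes should be passed to `mutation_rate`, so that the numerator and the reduced genome length refer to the same regions.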
Figure 6.
Convergent evolution in the Spike gene. (a) Amino acids of S targeted in both CoV-2-D and CoV-2-G backgrounds and their frequencies in each well (open circles). (b) Non-synonymous mutations on the spike detected in the populations where the mutators were observed (number of wells on the X-axis). The color annotation represents the N-terminal domain (NTD, 14–305), the receptor-binding domain (RBD, 319–541), the cleavage site (S1/S2, 669–688), the fusion peptide (FP, 788–806), the heptapeptide repeat sequences (HR1, 912–984 and HR2, 1163–1213), the transmembrane domain (TM, 1213–1237) and the cytoplasmic domain (CP, 1237–1273).
