Review
PLoS Pathog. 2023 Apr 5;19(4):e1011265.
doi: 10.1371/journal.ppat.1011265. eCollection 2023 Apr.

Developing an appropriate evolutionary baseline model for the study of SARS-CoV-2 patient samples

John W Terbot 2nd et al. PLoS Pathog. 2023.

Abstract

Over the past 3 years, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has spread through human populations in several waves, resulting in a global health crisis. In response, genomic surveillance efforts have proliferated in the hopes of tracking and anticipating the evolution of this virus, resulting in millions of patient isolates now being available in public databases. Yet, while there is a tremendous focus on identifying newly emerging adaptive viral variants, this quantification is far from trivial. Specifically, multiple co-occurring and interacting evolutionary processes are constantly in operation and must be jointly considered and modeled in order to perform accurate inference. We here outline critical individual components of such an evolutionary baseline model (mutation rates, recombination rates, the distribution of fitness effects, infection dynamics, and compartmentalization) and describe the current state of knowledge pertaining to the related parameters of each in SARS-CoV-2. We close with a series of recommendations for future clinical sampling, model construction, and statistical analysis.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. SARS-CoV-2 variant frequencies and genetic diversity through time, given for the state of Montana as illustration.
(A) Frequency of major WHO-defined variants of concern (VOCs), binned by week of sampling as derived from GISAID metadata. (B) Average pairwise nucleotide differences between consensus sequence genomes isolated from patient samples, shown genome-wide (orange line) and within the S-gene encoding the spike protein (blue line). As shown, local spread of major VOCs induced corresponding drops in overall genetic diversity across consensus sequences and within the S-gene. Note that while Alpha dominated early in Montana, multiple variants cocirculated at appreciable frequency in 2020 to 2021, and the dominant observed strain differed by location. Consensus sequences were downloaded from GISAID (n = 21,799), binned by week, and aligned to the SARS-CoV-2 reference sequence using Nextalign. Diversity was calculated in R, using the package PopGenome. Local polynomial regression fitting (method loess) was used in the R package ggplot2 to model diversity through time with the formula y ~ x. Case data (given by the light blue shading) were downloaded from the CDC.
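The diversity statistic in panel B, average pairwise nucleotide differences, can be illustrated with a minimal sketch. This is not the PopGenome implementation used for the figure; the function names, the toy sequences, and the choice to skip gaps and ambiguous bases are assumptions made for illustration only.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count aligned sites where two sequences differ.

    Assumption: sites with a gap ('-') or ambiguous base ('N') in either
    sequence are skipped rather than counted as differences.
    """
    skip = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in skip and b not in skip
    )


def mean_pairwise_diversity(sequences):
    """Average number of differences over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0  # fewer than two sequences: diversity is undefined, report 0
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment standing in for one weekly bin of consensus genomes.
week_bin = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(mean_pairwise_diversity(week_bin))
```

Applying such a function to each weekly bin, once genome-wide and once restricted to S-gene coordinates, would yield the two series that the caption describes being smoothed with loess.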
Fig 2. The expected and estimated distributions of fitness effects (DFE) of mutations.
Left panels: a hypothetical expected DFE of new, segregating, and fixed mutations, reflecting the effects of selection at each stage. Effectively neutral mutations are shown in gray, beneficial mutations in blue, and deleterious mutations in shades of red. Right panels: the DFE of new, segregating, and fixed amino acid variants in SARS-CoV-2 as recently estimated by Flynn and colleagues [131], Kepler and colleagues [65], and Obermeyer and colleagues [132], respectively. From Flynn and colleagues, the normalized functional scores from 2 sets of biological replicates were pooled together. From Kepler and colleagues, the relative fitness of all single mutations from pre- and post-2020 studies were pooled together. From Obermeyer and colleagues, the DFE of fixed mutations was approximated by using 31 high-frequency variants (defined as those present in more than 100 lineages). Importantly, observed genomic variation will depend heavily on the underlying heterogeneity in both mutation rates and DFEs across the genome, among other factors [133].
Fig 3. A schematic of a simple intra-host demographic model potentially underlying infection dynamics.
At the time of infection, the virus population will initially be characterized by a population bottleneck associated with the founder event. A successful infection will next be characterized by rapid population growth associated with high viral loads, followed by declining population sizes and viral loads as the patient begins to clear the infection. The details of this infection history will greatly shape the observed levels and patterns of intra-host diversity. Viral load schematic modified from [134].
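The three phases in this schematic (founder bottleneck, rapid growth, decline during clearance) can be sketched as a piecewise exponential size function. This is an illustrative toy, not a model fitted or proposed in the paper; the function name and every parameter value below (founder size, growth rate, peak time, clearance rate) are hypothetical placeholders.

```python
import math


def intra_host_population_size(
    t,
    founder_size=2.0,    # hypothetical number of founding virions (bottleneck)
    growth_rate=2.0,     # hypothetical per-day exponential growth rate
    peak_time=5.0,       # hypothetical day on which viral load peaks
    clearance_rate=1.0,  # hypothetical per-day exponential decline rate
):
    """Population size at time t (days since infection) under a toy model."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t <= peak_time:
        # Growth phase: expansion from the founder bottleneck.
        return founder_size * math.exp(growth_rate * t)
    # Clearance phase: decline from the peak size.
    peak_size = founder_size * math.exp(growth_rate * peak_time)
    return peak_size * math.exp(-clearance_rate * (t - peak_time))


# Trace the trajectory at a few time points.
for day in (0, 2, 5, 8, 12):
    print(day, round(intra_host_population_size(day), 1))
```

A size trajectory of this general shape is the kind of input a population genetic simulator would need in order to generate expected levels of intra-host diversity under the baseline demographic history.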
Fig 4. An example of clinical sampling of a patient over the course of an infection, demonstrating how a consensus sequence-based summary neglects the great majority of intra-host variants (which are expected to primarily segregate at low frequencies [shown as blue circles]).
This ascertainment of high-frequency intra-host variants (shown as red circles) for subsequent inter-host comparison thus represents an unfortunate and unnecessary loss of information.
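The information loss described here can be made concrete with a small sketch that splits intra-host variants by whether a consensus call would retain them. The majority-rule threshold of 0.5, the function name, the site labels, and the frequencies are all assumptions chosen for illustration; they are not data from the paper.

```python
def split_by_consensus(variant_freqs, threshold=0.5):
    """Split intra-host variants into those a consensus call keeps or drops.

    Assumption: a variant enters the consensus sequence only if its
    intra-host frequency exceeds `threshold` (majority rule by default).
    """
    retained = {s: f for s, f in variant_freqs.items() if f > threshold}
    lost = {s: f for s, f in variant_freqs.items() if f <= threshold}
    return retained, lost


# Hypothetical intra-host variant frequencies from one patient sample.
sample = {
    "site_A": 0.97,
    "site_B": 0.61,
    "site_C": 0.12,
    "site_D": 0.04,
    "site_E": 0.03,
    "site_F": 0.02,
}

retained, lost = split_by_consensus(sample)
print("kept in consensus:", sorted(retained))
print("discarded:", sorted(lost))
```

In this toy sample the consensus keeps two of six variants, and the four that are dropped are exactly the low-frequency class that the caption notes is expected to make up most intra-host variation.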

