[Preprint]. 2021 Mar 2:2020.10.12.336644.
doi: 10.1101/2020.10.12.336644.

Ongoing Global and Regional Adaptive Evolution of SARS-CoV-2


Nash D Rochman et al. bioRxiv.


Abstract

Understanding the trends in SARS-CoV-2 evolution is paramount to controlling the COVID-19 pandemic. We analyzed more than 300,000 high-quality genome sequences of SARS-CoV-2 variants available as of January 2021. The results show that the ongoing evolution of SARS-CoV-2 during the pandemic is characterized primarily by purifying selection, but a small set of sites appears to evolve under positive selection. The receptor-binding domain of the spike protein and the nuclear localization signal (NLS)-associated region of the nucleocapsid protein are enriched with positively selected amino acid replacements. These replacements form a strongly connected network of apparent epistatic interactions and are signatures of major partitions in the SARS-CoV-2 phylogeny. Virus diversity within each geographic region has grown steadily throughout the pandemic, but analysis of the phylogenetic distances between pairs of regions reveals four distinct periods based on global partitioning of the tree and the emergence of key mutations. The initial period of rapid diversification into region-specific phylogenies, which ended in February 2020, was followed by a major extinction event and global homogenization concomitant with the spread of D614G in the spike protein, ending in March 2020. The NLS-associated variants across multiple partitions rose to global prominence in March-July, during a period of stasis in terms of inter-regional diversity. Finally, beginning in July 2020, multiple mutations, some of which have since been demonstrated to enable antibody evasion, began to emerge in association with ongoing regional diversification, which might be indicative of speciation.

Keywords: SARS-CoV-2; ancestral reconstruction; epistasis; globalization; phylogeny.


Figures

Figure 1. Evolution of SARS-CoV-2.
A. Global tree reconstruction with 8 principal partitions and 3 variant clades enumerated and color-coded. B. Signatures of amino acid replacements for each partition. Sites are ordered as they appear in the genome. The proteins, along with nucleotide and amino acid numbers, are indicated underneath each column. C. Site history trees for the spike 614 and nucleocapsid 203 positions. Nodes were included in this reduced tree if they immediately succeed a substitution, represent the last common ancestor of at least two substitutions, or are terminal nodes representing branches of five or more sequences (approximately, based on tree weight). Edges are colored according to their position in the main partitions, and the line type corresponds to the target mutation (solid) or any other state (dashed). Synonymous mutations are not shown. These sites are largely binary, as are most sites in the genome. The sizes of the terminal nodes are proportional to the log of the weight descendant from that node beyond which no substitutions in the site occurred. Node color corresponds to the target mutation (black) or any other state (gray). D. Network of putative epistatic interactions for likely positively selected residues in the N and S proteins.
Figure 2. Regional SARS-CoV-2 partition dynamics during the COVID-19 pandemic.
Probability distributions are shown; for the absolute number of sequences, see Fig. S15.
Figure 3. Global and regional trends in SARS-CoV-2 evolution.
A. Global distribution of sequences with sequencing locations in each of the six regions considered. The color scheme is for visual distinction only. B. Intra-regional diversity measured by the mean tree-distance for pairs of isolates. C. (Top) The Hellinger distance for all pairs of regions over the distribution across the 11 partitions/clades. The 25th, 50th, and 75th percentiles are shown. (Bottom) The ratio of the mean tree-distance for pairs of isolates between regions vs. isolates within regions. The 25th, 50th, and 75th percentiles are shown. D. The frequency of S|614G, at least one NLS-associated variant (N|194L, N|119L, N|203K, N|205I, and N|220V), and at least one emerging spike variant (Fig. S23, excluding S|477N).
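
The Hellinger distance used in panel C of Figure 3 compares two discrete probability distributions, here the share of each region's sequences falling into each of the 11 partitions/clades. The following is a minimal sketch of the standard definition, not the authors' code; the function name `hellinger` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of non-negative counts or
    frequencies (hypothetical inputs: one value per partition/clade).
    Each is normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs a positive total")
    # Sum of squared differences between the square roots of the probabilities.
    squared = sum(
        (math.sqrt(a / total_p) - math.sqrt(b / total_q)) ** 2
        for a, b in zip(p, q)
    )
    # The factor 1/2 bounds the distance to the interval [0, 1].
    return math.sqrt(squared / 2)


# Made-up sequence counts per partition/clade for two regions (11 categories).
region_a = [120, 40, 5, 0, 10, 300, 25, 0, 0, 60, 15]
region_b = [10, 200, 50, 30, 0, 80, 5, 45, 20, 0, 10]
distance = hellinger(region_a, region_b)
```

A distance of 0 means the two regions have identical partition compositions, and 1 means they share no partitions at all, so a falling value over time corresponds to the homogenization described in the abstract.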
