Genome Res. 2021 Apr;31(4):635-644.
doi: 10.1101/gr.268961.120. Epub 2021 Feb 18.

SARS-CoV-2 genomic diversity and the implications for qRT-PCR diagnostics and transmission

Nicolae Sapoval et al. Genome Res. 2021 Apr.

Abstract

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.

Figures

Figure 1.
Overview of general diversity of SARS-CoV-2. From outer to inner layers: Annotation of SARS-CoV-2 genome (gray), PCR primer designs (dark red), transcription-regulating sequences (TRS) (orange), intra-host variant density including iSNVs (blue), deletion start sites (red), duplication start sites (yellow), and inversion start sites (green) along the entire genome. For SNPs + iSNVs + SVs, we plotted the density scaled by their allele frequency across the population over 100-bp windows.
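The windowed density described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes that "scaled by allele frequency" means summing the allele frequency of every variant that falls in a 100-bp window; the caption does not spell out the exact scaling, so this is one plausible reading and not the authors' implementation. The function name and the input layout (1-based position, allele frequency) are hypothetical.

```python
WINDOW = 100  # window size in bp, taken from the caption


def windowed_density(variants, genome_length, window=WINDOW):
    """Sum allele frequencies per fixed-size window along the genome.

    variants: iterable of (position, allele_frequency) with 1-based positions.
    Returns a list with one value per window, in genome order.
    Assumption: density = sum of allele frequencies within the window.
    """
    n_windows = (genome_length + window - 1) // window  # ceiling division
    density = [0.0] * n_windows
    for pos, af in variants:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        density[(pos - 1) // window] += af
    return density


# Toy input: positions 50 and 100 share the first window, 101 opens the second.
example = windowed_density([(50, 0.1), (100, 0.2), (101, 0.5)], genome_length=250)
```

With the toy input above, the first window accumulates 0.1 + 0.2, the second holds 0.5, and the third stays at zero.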
Figure 2.
Mutational frequencies of iSNVs and SNPs. (A) Distribution of iSNV AF. We note that the distribution of AF is strictly <50% as iSNVs are below consensus-level by definition. (B) Mutational spectrum of SARS-CoV-2. (C) Mutational spectra of SARS-CoV-1, SARS-CoV-2, and MERS. (D) Mutational spectrum of SARS-CoV-2 by orf/nsp.
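A mutational spectrum of the kind shown in Figure 2 can be tabulated as the fraction of substitutions in each of the 12 possible base-change classes. The sketch below assumes variants are available as (reference base, alternate base) pairs and writes T as U to match the RNA notation used in the abstract (for example C > U). The function name and input layout are hypothetical, and the normalization (fractions summing to 1) is an assumption about how the panels are scaled.

```python
from collections import Counter

BASES = "ACGU"


def mutational_spectrum(substitutions):
    """Return the fraction of substitutions in each of the 12 classes.

    substitutions: iterable of (ref, alt) single-base strings.
    T is converted to U; non-substitutions and ambiguous bases are skipped.
    """
    counts = Counter()
    for ref, alt in substitutions:
        ref = ref.upper().replace("T", "U")
        alt = alt.upper().replace("T", "U")
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # not a clean single-nucleotide substitution
        counts[f"{ref}>{alt}"] += 1
    total = sum(counts.values())
    return {
        f"{r}>{a}": (counts[f"{r}>{a}"] / total if total else 0.0)
        for r in BASES
        for a in BASES
        if r != a
    }


# Toy input: two C>U changes, one G>A change, and one non-substitution.
spectrum = mutational_spectrum([("C", "T"), ("C", "U"), ("G", "A"), ("A", "A")])
```

Running the same function separately on iSNVs and on consensus-level SNPs would give the two spectra being compared.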
Figure 3.
Shared SNPs and SNVs across data sets. (A) Illustration differentiating what we define as an intra-host SNV (iSNV) and an inter-host consensus-level SNP. (B) UpSet plot captures the shared single nucleotide variants between iSNVs and consensus-level SNPs. The horizontal bars on the left show the total number of variants in the given category. Vertical bars indicate the size of the intersection between highlighted (with black circles) sets. Every variant contributes to exactly one intersection size to avoid double counting.
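The counting rule stated in the Figure 3 caption, where every variant contributes to exactly one intersection, can be sketched as follows. Each variant is assigned to the exact combination of sets that contain it, so the intersection sizes add up to the number of distinct variants. The set names and variant identifiers below are made up for illustration.

```python
from collections import Counter


def exclusive_intersections(named_sets):
    """Count variants per exact combination of sets, with no double counting.

    named_sets: dict mapping a set name to a set of hashable variant ids.
    Returns a Counter keyed by frozenset of set names.
    """
    membership = {}
    for name, items in named_sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


# Hypothetical variant ids; real input would be (position, ref, alt) tuples.
toy_sets = {"iSNV": {1, 2, 3}, "SNP": {2, 3, 4, 5}}
sizes = exclusive_intersections(toy_sets)
```

Here variant 1 is counted only under iSNV, variants 4 and 5 only under SNP, and variants 2 and 3 only under the shared combination.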
Figure 4.
iSNV and SNP presence on widely used primers and probes. This figure shows the locations on WHO probes and primers that contain SNPs (left) and iSNVs (right). Columns correspond to base pair positions within the probe, and the sequences are 5′-aligned. Rows correspond to the oligonucleotide sequences, and squares are highlighted based on how many samples/genomes contain a variant in that position.
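The 5′-aligned layout in Figure 4 amounts to converting each genomic variant position into an offset from the 5′ end of the oligonucleotide it overlaps, then counting how many samples carry a variant at each offset. The sketch below is a rough illustration under stated assumptions: the oligo coordinates are hypothetical (not real WHO primer positions), offsets are 0-based, and reverse-strand oligos are assumed to have their 5′ end at the higher genomic coordinate.

```python
from collections import Counter


def offset_from_5prime(variant_pos, start, end, strand):
    """Return the 0-based offset from the oligo's 5' end, or None if outside.

    start and end are 1-based inclusive genomic coordinates with start <= end.
    strand is "+" for oligos matching the reference, "-" for reverse-complement ones.
    """
    if strand not in ("+", "-"):
        raise ValueError(f"unknown strand: {strand!r}")
    if not start <= variant_pos <= end:
        return None
    return variant_pos - start if strand == "+" else end - variant_pos


def variant_counts_per_offset(oligo, variant_positions):
    """Count observed variants at each 5'-aligned offset of one oligo."""
    counts = Counter()
    for pos in variant_positions:
        offset = offset_from_5prime(pos, oligo["start"], oligo["end"], oligo["strand"])
        if offset is not None:
            counts[offset] += 1
    return counts


# Hypothetical oligo coordinates and variant observations (one entry per sample).
forward = {"start": 1000, "end": 1019, "strand": "+"}
reverse = {"start": 1100, "end": 1119, "strand": "-"}
observed = [1000, 1005, 1005, 1119, 1101, 2000]
forward_counts = variant_counts_per_offset(forward, observed)
reverse_counts = variant_counts_per_offset(reverse, observed)
```

The per-offset counts correspond to the highlighted squares in one row of the figure.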
Figure 5.
In-depth analysis of shared iSNVs. (A) Paired samples from patient COVSUBJ 9 in NYC. (B) Paired samples from patient COVSUBJ 0639 in NYC. (C) The distribution of the number of genomic pairs and their shared variants. (D) The number of pairs with variants at given nucleotide positions. Red color represents positions that were shown to be highly homoplasic and more likely to be affected by error (De Maio et al. 2020).
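The pairwise summaries in panels C and D of Figure 5 can be approximated by intersecting the variant sets of every pair of samples. This sketch assumes each sample is represented as a set of (position, alternate base) tuples and that only pairs sharing at least one variant are counted; both are assumptions, and the sample names and variants are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def shared_variant_summary(sample_variants):
    """Summarize variants shared between all pairs of samples.

    sample_variants: dict mapping sample name to a set of (position, alt) tuples.
    Returns (pairs_by_shared_count, pairs_by_position):
      pairs_by_shared_count[k] = number of pairs sharing exactly k variants (k >= 1)
      pairs_by_position[pos]   = number of pairs sharing a variant at pos
    """
    pairs_by_shared_count = Counter()
    pairs_by_position = Counter()
    for a, b in combinations(sorted(sample_variants), 2):
        shared = sample_variants[a] & sample_variants[b]
        if not shared:
            continue  # pairs with nothing in common are not counted
        pairs_by_shared_count[len(shared)] += 1
        pairs_by_position.update({pos for pos, _alt in shared})
    return pairs_by_shared_count, pairs_by_position


# Hypothetical samples and variants.
toy_samples = {
    "s1": {(100, "U"), (200, "A")},
    "s2": {(100, "U"), (200, "A"), (300, "G")},
    "s3": {(100, "U")},
}
by_count, by_position = shared_variant_summary(toy_samples)
```

Positions flagged as error-prone, such as the homoplasic sites the caption marks in red, could be filtered out of each set before calling the function.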

References

    1. Barbezange C, Jones L, Blanc H, Isakov O, Celniker G, Enouf V, Shomron N, Vignuzzi M, van der Werf S. 2018. Seasonal genetic drift of human influenza A virus quasispecies revealed by deep sequencing. Front Microbiol 9: 2596. doi: 10.3389/fmicb.2018.02596
    2. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120. doi: 10.1093/bioinformatics/btu170
    3. Borucki MK, Collette NM, Coffey LL, Rompay KKAV, Hwang MH, Thissen JB, Allen JE, Zemla AT. 2019. Multiscale analysis for patterns of Zika virus genotype emergence, spread, and consequence. PLoS One 14: e0225699. doi: 10.1371/journal.pone.0225699
    4. Butler D, Mozsary C, Meydan C, Foox J, Rosiene J, Shaiber A, Danko D, Afshinnekoo E, MacKay M, Sedlazeck FJ, et al. 2021. Shotgun transcriptome, spatial omics, and isothermal profiling of SARS-CoV-2 infection reveals unique host responses, viral diversification, and drug interactions. Nat Commun 12: 1660. doi: 10.1038/s41467-021-21361-7
    5. Chen X, Schulz-Trieglaff O, Shaw R, Barnes B, Schlesinger F, Källberg M, Cox AJ, Kruglyak S, Saunders CT. 2016. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32: 1220–1222. doi: 10.1093/bioinformatics/btv710
