iScience. 2022 Jun 17;25(6):104487.
doi: 10.1016/j.isci.2022.104487. Epub 2022 May 30.

Early detection and improved genomic surveillance of SARS-CoV-2 variants from deep sequencing data

Daniele Ramazzotti et al. iScience. 2022.

Abstract

A key task of genomic surveillance of infectious viral diseases is the early detection of dangerous variants. Unexpected help to this end comes from the analysis of deep sequencing data of viral samples, which are typically discarded after consensus sequences are created. Such analysis allows one to detect intra-host low-frequency mutations, which are a footprint of the mutational processes underlying the origination of new variants. Their timely identification may improve public-health decision-making compared with traditional approaches that rely on consensus sequences. We present the analysis of 220,788 high-quality deep sequencing SARS-CoV-2 samples, showing that many spike and nucleocapsid mutations of interest associated with the most widely circulating variants, including Beta, Delta, and Omicron, might have been intercepted several months in advance. Furthermore, we show that a refined genomic surveillance system leveraging deep sequencing data might allow one to pinpoint emerging mutation patterns, providing automated, data-driven support to virologists and epidemiologists.

Keywords: bioinformatics; genomic analysis; microbiology; virology.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Graphical abstract
Figure 1
SARS-CoV-2 samples in GISAID and NCBI public repositories. Number of SARS-CoV-2 samples for which either deep sequencing data or consensus sequences are available, grouped by the month in which the related dataset was released, in the period January 2020–August 2021. Source databases are NCBI (National Center for Biotechnology Information, 2021) for deep sequencing data and GISAID (Shu and McCauley, 2017) for consensus sequences (update: August 2021).
Figure 2
Early detection of 6 SMoIs associated with hazardous variants from deep sequencing data. Analysis of the spike mutations of interest (SMoIs) S:L18F, S:Q414K, S:L452R, S:T478K, S:H655Y, and S:A701V (see Table 1). Circles with purple borders mark the first month in which the mutation was detected as minor (MF ≥5% and <50%) in at least 5 samples while still being undetected as fixed (MF ≥50%); circles with blue borders mark the month in which the mutation was first detected as fixed in at least 1 sample; red lines highlight the anticipation (when >1 month). The analysis is performed by splitting the samples into the 6 distinct geographical regions and by reporting the corresponding results at the global scale. All circles contain a pie chart that displays the ratio of samples showing that mutation either as minor or as fixed in that month (further details are provided in the main text). For each SMoI, the related variants are also reported.
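The detection rule stated in this caption can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names, the `(month, MF)` input layout, and the reading of "still undetected as fixed" as "strictly before the first month with a fixed call" are assumptions made for illustration. Only the thresholds (minor: MF ≥5% and <50%; fixed: MF ≥50%; at least 5 minor samples) come from the caption.

```python
from collections import defaultdict

MINOR_MIN = 0.05        # lower MF bound for a "minor" call (from the caption)
FIXED_MIN = 0.50        # MF at or above this value is a "fixed" call (from the caption)
MIN_MINOR_SAMPLES = 5   # minor calls required in one month (from the caption)


def classify(mf):
    """Return 'fixed', 'minor', or None for a mutation frequency in [0, 1]."""
    if mf >= FIXED_MIN:
        return "fixed"
    if mf >= MINOR_MIN:
        return "minor"
    return None


def month_index(month):
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, mon = month.split("-")
    return int(year) * 12 + int(mon)


def early_detection(observations):
    """Hypothetical helper. `observations` holds one (month, MF) pair per sample
    carrying the mutation. Returns (first_minor_month, first_fixed_month,
    anticipation_in_months); an entry is None when it is undefined.
    """
    minor_counts = defaultdict(int)
    fixed_months = set()
    for month, mf in observations:
        status = classify(mf)
        if status == "minor":
            minor_counts[month] += 1
        elif status == "fixed":
            fixed_months.add(month)

    first_fixed = min(fixed_months, key=month_index) if fixed_months else None
    # Assumption: a minor detection only counts if it precedes every fixed call.
    candidates = [
        m for m, n in minor_counts.items()
        if n >= MIN_MINOR_SAMPLES
        and (first_fixed is None or month_index(m) < month_index(first_fixed))
    ]
    first_minor = min(candidates, key=month_index) if candidates else None
    if first_minor is None or first_fixed is None:
        return first_minor, first_fixed, None
    gap = month_index(first_fixed) - month_index(first_minor)
    return first_minor, first_fixed, gap


# Toy input (not real data): five minor calls in March, one fixed call in July.
toy = [("2020-03", 0.12)] * 5 + [("2020-07", 0.93)]
print(early_detection(toy))
```

On the toy input, the gap between the two months is what the caption calls the anticipation. Running the same function separately on the samples of each geographical region and on all samples together would mirror the regional and global rows of the figure.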
Figure 3
Mutant frequency and prevalence variation in time of SMoIs S:L452R and S:H655Y. The leftmost panels show the distribution of the mutation frequency (MF) of all samples with SMoIs S:L452R (upper panels) and S:H655Y (lower panels), grouped by month and geographical region. Each cell shows the proportion of samples showing the mutation with that specific MF. The rightmost panels show the number of samples showing the mutations either as minor (MF ≥5% and <50%) or as fixed (MF ≥50%). The lineages associated with both variants are also displayed.
Figure 4
Early detection of 6 S mutations not associated with known variants. Analysis of 6 S mutations originally detected as minor (in at least 5 samples) and only subsequently as fixed at the global scale, namely, S:W152C, S:S297L, S:C361S, S:G446V, S:A570D, and S:T791K. For further details, please refer to the caption of Figure 2. S mutations first detected as minor at the local scale are shown in Figure S1 in the supplementary information.
Figure 5
Early detection of N mutations. Analysis of the nucleocapsid mutation of interest (NMoI) N:D377Y and of the three highly diffused N mutations originally detected as minor (in at least 5 samples) and only subsequently as fixed at the global scale, namely, N:L219F, N:A254S, and N:A254V. For further details, please refer to the caption of Figure 2. N mutations first detected as minor at the local scale are shown in Figure S2 in the supplementary information.
Figure 6
Analysis of homoplastic minor variants. (A–D) The heatmaps show the prevalence (i.e., number of samples over the total) of the SMoIs (panel A), additional highly diffused S mutations (B), the NMoIs (C), and the additional highly diffused N mutations (D) retrieved as minor (MF >5% and ≤50%) in the samples associated with the variants of Table 1 via Pangolin (O’Toole et al., 2021a, 2021b). Only the mutations observed in at least 1% of the samples of any variant are shown.
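The prevalence computation behind these heatmaps can be sketched in a few lines of Python. This is an illustrative reading of the caption rather than the authors' code: the function name, the input layout (one variant label plus a set of minor mutations per sample), and the assumption that the variant label has already been assigned (for example by Pangolin) are all hypothetical. The 1% filter is the one stated in the caption.

```python
from collections import Counter, defaultdict


def minor_prevalence(samples, min_prevalence=0.01):
    """Hypothetical helper. `samples` holds one (variant_label, minor_mutations)
    pair per sample. Returns {mutation: {variant: prevalence}} restricted to
    mutations whose prevalence reaches `min_prevalence` in at least one variant.
    """
    totals = Counter()            # number of samples per variant
    hits = defaultdict(Counter)   # mutation -> variant -> samples with a minor call
    for variant, minor_mutations in samples:
        totals[variant] += 1
        for mutation in set(minor_mutations):
            hits[mutation][variant] += 1

    table = {}
    for mutation, per_variant in hits.items():
        # Prevalence = samples showing the mutation as minor / all samples of the variant.
        row = {v: per_variant[v] / totals[v] for v in totals}
        if max(row.values()) >= min_prevalence:
            table[mutation] = row
    return table


# Toy input (not real data): four samples labelled Delta, two labelled Beta.
toy = [("Delta", {"S:L18F"})] + [("Delta", set())] * 3 + [("Beta", set())] * 2
print(minor_prevalence(toy))
```

Each row of the returned table corresponds to one heatmap row (a mutation) and each key inside it to one column (a variant).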
