Genome Med. 2020 Jun 30;12(1):57.
doi: 10.1186/s13073-020-00751-4.

Multiple approaches for massively parallel sequencing of SARS-CoV-2 genomes directly from clinical samples

Minfeng Xiao et al. Genome Med. 2020.

Abstract

Background: COVID-19 (coronavirus disease 2019) has caused a major epidemic worldwide; however, much remains unknown about the epidemiology and evolution of the virus, partly because few full-length SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) genomes have been reported. One reason is that the challenges underlying the sequencing of SARS-CoV-2 directly from clinical samples have not been fully addressed; in particular, sequencing samples with low viral load often yields too few viral reads for analysis.

Methods: We applied a novel multiplex PCR amplicon (amplicon)-based sequencing approach and hybrid capture (capture)-based sequencing, as well as ultra-high-throughput metatranscriptomic (meta) sequencing, to retrieve complete genomes and inter-individual and intra-individual variations of SARS-CoV-2 from serial dilutions of a cultured isolate and from eight clinical samples covering a range of sample types and viral loads. We also examined and compared the sensitivity, accuracy, and other characteristics of these approaches in a comprehensive manner.

Results: We demonstrated that both the amplicon and capture methods efficiently enriched SARS-CoV-2 content from clinical samples, while the enrichment efficiency of amplicon exceeded that of capture in more challenging samples. We found that capture was not as accurate as meta and amplicon in identifying between-sample variations, whereas the amplicon method was not as accurate as the other two in investigating within-sample variations, suggesting that amplicon sequencing was not suitable for studying virus-host interactions and viral transmission, which rely heavily on intra-host dynamics. We illustrated that meta uncovered rich genetic information in the clinical samples besides SARS-CoV-2, providing references for clinical diagnostics and therapeutics. Taking all the factors above and cost-effectiveness into consideration, we proposed guidance on how to choose a sequencing strategy for SARS-CoV-2 under different situations.

Conclusions: This is, to the best of our knowledge, the first work systematically investigating inter- and intra-individual variations of SARS-CoV-2 using amplicon- and capture-based whole-genome sequencing, as well as the first comparative study among multiple approaches. Our work offers practical solutions for genome sequencing and analyses of SARS-CoV-2 and other emerging viruses.

Keywords: COVID-19; Emerging infectious diseases; Genomic surveillance; Hybrid capture; Metatranscriptomic sequencing; Multiplex PCR; Quasispecies; Virus evolution; iSNV.

Conflict of interest statement

L.Y., Y.Z., F.C., and X.X. have applied for a patent relating to the amplicon-based method, and the details can be found below:

PCR primer pair and application thereof

Patent applicant: MGI Tech Co., Ltd

Name of inventor(s): Lin Yang, Ya Gao, Guodong Huang, Yicong Wang, Yuqian Wang, Yanyan Zhang, Fang Chen, Na Zhong, Hui Jiang, Xun Xu

Application number: PCT/CN2017/089195

The remaining authors declare that they have no competing interests.

Figures

Fig. 1
The general workflow of the multiple sequencing approaches adopted in this study. We employed a unique dual indexing (UDI) strategy and a DNB-based (DNA nanoball) PCR-free MPS platform to minimize index hopping and relevant sequencing errors [–43]. a Amplicon-based enrichment: the UDI was integrated in the 2nd PCR. Navy, multiplex PCR primers. b Metatranscriptomic library preparations: the UDI was integrated in the adaptor ligation and universal PCR steps. c Library preparations and hybrid capture-based enrichment: the UDI was integrated in the adaptor ligation and pre-capture PCR steps. Ocher, ssDNA probes. Red and green lines represent adaptor sequences; green dots represent phosphate groups
Fig. 2
Overview of the study design. Eight clinical samples and serial dilutions of a cultured isolate were subjected to direct metatranscriptomic library construction, amplicon-based enrichment, and hybrid capture-based enrichment, respectively. Libraries generated from each method were pooled, respectively. DNB, DNA nanoball. 14, GZMU0014; 16, GZMU0016; 30, GZMU0030; 31, GZMU0031; 42, GZMU0042; 44, GZMU0044; 47, GZMU0047; 48, GZMU0048. D0, undiluted sample of the cultured isolate; D1–D7, seven serially diluted samples of the cultured isolate, ranging from 1E+07 to 1E+01 genome copies per milliliter, in 10-fold dilution. “-”, negative controls prepared from nuclease-free water and human nucleic acids. PE100, paired-end 100-nt reads; SE400, single-end 400-nt reads
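The 10-fold dilution series described in the Fig. 2 caption can be written out arithmetically. The short Python sketch below is illustrative only: the function name `dilution_series` is hypothetical, and the mapping of D1 to 1E+07 and D7 to 1E+01 genome copies per milliliter assumes the labels follow the order in which the range is stated.

```python
def dilution_series(start_copies_per_ml=1e7, fold=10, steps=7):
    """Return nominal genome copies/mL for D1..D{steps}.

    Assumption: D1 is the most concentrated dilution (1E+07) and each
    later label is a further `fold`-fold dilution of the previous one.
    """
    return {f"D{i + 1}": start_copies_per_ml / fold ** i for i in range(steps)}


series = dilution_series()
for label, copies in series.items():
    print(f"{label}: {copies:.0e} genome copies/mL")
```

These are nominal concentrations; the undiluted sample D0 is not part of the series because its concentration is not given in the caption.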
Fig. 3
Sequencing coverage and depth of the cultured isolate and eight clinical samples. a Amplicon sequencing coverage by sample (row) across the SARS-CoV-2 genome. Dark blue, sequencing depth ≥ 100×; heatmap (bottom) sums coverage across all samples. HNA, negative control prepared from human nucleic acids; water, negative control prepared from nuclease-free water. Green horizontal lines on heatmap, amplicon locations. Overlap regions between amplicons range from 59 to 209 bp. b–d Normalized coverage across viral genomes of the clinical samples across methods. e SARS-CoV-2-RPM sequence plotted against genome copies per milliliter for the cultured isolate. Three independent experiments were performed for amplicon sequencing. Dark blue, ~ 400 bp amplicon-based sequencing including human and lambda phage nucleic acid background; soft blue, ~ 200 bp amplicon-based sequencing; fluorescent cyan, ~ 400 bp amplicon-based sequencing excluding human and lambda phage nucleic acid background (NAB); red, capture sequencing; grey, meta sequencing. f SARS-CoV-2-RPM (reads per million) sequence plotted against qRT-PCR Ct value for the clinical samples. Dark blue, amplicon; red, capture; grey, meta. g Estimated minimum amount of bases required by each method for high-confidence downstream analyses. Dark blue, amplicon; red, capture
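SARS-CoV-2-RPM in Fig. 3 is a reads-per-million normalization. As a minimal sketch, assuming the denominator is the total number of sequenced reads in the library (the excerpt does not say whether filtered or mapped reads are used instead), it could be computed as follows. The function name and the example counts are hypothetical.

```python
def reads_per_million(viral_reads: int, total_reads: int) -> float:
    """Normalize a viral read count to reads per million (RPM).

    Assumption: `total_reads` is the total number of reads in the library;
    the study's exact denominator is not stated in this excerpt.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if viral_reads < 0 or viral_reads > total_reads:
        raise ValueError("viral_reads must be between 0 and total_reads")
    # Multiply before dividing to limit floating-point rounding.
    return viral_reads * 1_000_000 / total_reads


# Hypothetical counts, for illustration only.
rpm = reads_per_million(viral_reads=500, total_reads=10_000_000)
print(rpm)
```

Normalizing to RPM lets libraries of different sequencing depth be compared on the same axis, which is how the figure relates viral content to genome copies per milliliter and to Ct value.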
Fig. 4
Between-sample and within-sample variants of SARS-CoV-2 detected across methods. a SNVs detected between clinical samples against a reference genome (GISAID accession: EPI_ISL_402119) [27]. Alleles with ≥ 80% frequencies were called. *SNVs verified by Sanger sequencing. b Allele frequencies of the identified SNVs. Dark blue, amplicon; red, capture; grey, meta. Minor allele frequencies detected in serial dilutions of the cultured isolate (c) and clinical samples (d) across methods. Dark blue, amplicon vs meta; red, capture vs meta. Minor alleles are defined with ≥ 5% and < 50% frequencies. In addition to the general quality filter, iSNVs had to pass the depth and strand bias filters as described in the “Methods” section
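The two frequency thresholds stated in the Fig. 4 caption (≥ 80% for called SNVs; ≥ 5% and < 50% for minor alleles) can be expressed as a simple classifier. This is a sketch of the thresholds only, not the study's pipeline: the function name and category labels are hypothetical, and the quality, depth, and strand bias filters mentioned in the caption are not modeled.

```python
def classify_allele(freq: float) -> str:
    """Classify an allele by frequency using the Fig. 4 caption thresholds.

    Only the frequency cut-offs are applied here; the quality, depth, and
    strand bias filters described in the paper's Methods are out of scope.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be a fraction between 0 and 1")
    if freq >= 0.80:
        return "called_snv"       # between-sample SNV (>= 80%)
    if 0.05 <= freq < 0.50:
        return "minor_allele"     # within-sample minor allele (>= 5%, < 50%)
    return "unclassified"         # below 5%, or between 50% and 80%


for f in (0.95, 0.20, 0.60, 0.01):
    print(f, classify_allele(f))
```

Frequencies from 50% up to, but not including, 80% fall under neither definition in the caption, so the sketch leaves them unclassified rather than guessing how the study treats them.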

References

    1. WHO: Coronavirus disease (COVID-2019) situation report - 54. World Health Organization; 2020.
    2. Dudas G, Carvalho LM, Bedford T, Tatem AJ, Baele G, Faria NR, Park DJ, Ladner JT, Arias A, Asogun D, et al. Virus genomes reveal factors that spread and sustained the Ebola epidemic. Nature. 2017;544:309–315. doi: 10.1038/nature22040.
    3. Dudas G, Carvalho LM, Rambaut A, Bedford T. Correction: MERS-CoV spillover at the camel-human interface. Elife. 2018;7.
    4. Gardy JL, Loman NJ. Towards a genomics-informed, real-time, global pathogen surveillance system. Nat Rev Genet. 2018;19:9–20. doi: 10.1038/nrg.2017.88.
    5. Gire SK, Goba A, Andersen KG, Sealfon RS, Park DJ, Kanneh L, Jalloh S, Momoh M, Fullah M, Dudas G, et al. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science. 2014;345:1369–1372. doi: 10.1126/science.1259657.
