Evaluation of NGS-based approaches for SARS-CoV-2 whole genome characterisation

Caroline Charre^{1,2,3}, Christophe Ginevra^{4,5}, Marina Sabatier^{1,2,6}, Hadrien Regue^{1,2}, Grégory Destras^{1,2,6,7}, Solenne Brun^{1,7}, Gwendolyne Burfin^{1,7}, Caroline Scholtes^{1,2,3}, Florence Morfin^{1,2,6,7}, Martine Valette^{1,7}, Bruno Lina^{1,2,6,7}, Antonin Bal^{1,2,6,7}, Laurence Josset^{1,2,6,7}

Affiliations

¹ Laboratoire de Virologie, Institut des Agents Infectieux (IAI), Hospices Civils de Lyon, Groupement Hospitalier Nord, Lyon cedex 4, France.
² Université de Lyon, Université Claude Bernard Lyon 1 (UCBL1), Lyon, France.
³ Centre de recherche en cancérologie (CRCL), INSERM U1052- 69008, Lyon, France.
⁴ CIRI, Centre International de Recherche en Infectiologie, (Team pathogenesis of Legionella), Univ Lyon, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, ENS de Lyon, Lyon 69007, France.
⁵ Hospices Civils de Lyon, Centre National de Référence des Légionelles, Lyon, France.
⁶ Université de Lyon, Virpath, CIRI, INSERM U1111, CNRS UMR5308, ENS Lyon, Université Claude Bernard Lyon 1, Lyon, France.
⁷ Centre National de Référence France-Sud des Virus des Infections Respiratoires, Hospices Civils de Lyon, Groupement Hospitalier Nord, Lyon cedex 4, France.

PMID: 33318859
PMCID: PMC7665770
DOI: 10.1093/ve/veaa075

Caroline Charre et al. Virus Evol. 2020 Oct 5;6(2):veaa075. doi: 10.1093/ve/veaa075. eCollection 2020 Jul.

Abstract

Since the beginning of the COVID-19 outbreak, SARS-CoV-2 whole-genome sequencing (WGS) has been performed at an unprecedented rate worldwide using very diverse next-generation sequencing (NGS) methods. Herein, we compare the performance of four NGS-based approaches for SARS-CoV-2 WGS. Twenty-four clinical respiratory samples with a wide range of Ct values (from 10.7 to 33.9) were sequenced with the four methods. Three used Illumina sequencing: an in-house metagenomic NGS (mNGS) protocol and two newly commercialised kits, namely a hybridisation capture method developed by Illumina (DNA Prep with Enrichment kit and Respiratory Virus Oligo Panel, RVOP) and an amplicon sequencing method developed by Paragon Genomics (CleanPlex SARS-CoV-2 kit). We also evaluated the widely used amplicon sequencing protocol developed by the ARTIC Network, combined with Oxford Nanopore Technologies (ONT) sequencing. All four methods yielded near-complete genomes (>99 per cent) for high-viral-load samples (n = 8), with mNGS and RVOP producing the most complete genomes. For mid viral loads (Ct 20-25), amplicon-based enrichment methods gave genome coverage >99 per cent for all samples, whereas 1/8 samples sequenced with RVOP and 2/8 samples sequenced with mNGS had genome coverage below 99 per cent. For low viral loads (Ct ≥25), amplicon-based enrichment methods were the most sensitive techniques. All methods were highly concordant in terms of identity of the complete consensus sequences. Just one mismatch in three samples was observed with CleanPlex versus the other methods, due to the dedicated bioinformatics pipeline setting a high threshold for calling SNPs relative to the reference sequence. Importantly, all methods correctly identified a newly observed 34-nt deletion in ORF6, although RVOP required specific bioinformatic validation. Finally, as a major warning for targeted techniques, a loss of coverage in any given region of the genome should alert to a potential rearrangement or a SNP in primer-annealing or probe-hybridising regions and would require further validation using unbiased metagenomic sequencing.

Keywords: COVID-19; SARS-CoV-2; genomic surveillance; next generation sequencing; whole-genome sequencing.

Figures

**Figure 1.**
Plots of coverage according to the evaluated methods and cycle threshold (Ct) value groups. Dotted lines indicate the minimum depth of 10× for Illumina methods and 20× for the ONT method. Missing sites in the genome are those with a coverage <10× for Illumina methods and <20× for the ARTIC-ONT method. These plots were constructed with an R script using ggplot2, from depth files generated by BEDtools from the aligned BAM files output by each method-specific pipeline.
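
The depth thresholds in this caption can be illustrated with a minimal Python sketch. It is not the authors' R script: it assumes a three-column, tab-separated per-base depth file (reference name, position, depth), which is the layout produced by `bedtools genomecov -d`, and the function name `covered_fraction` is hypothetical. It only shows how the fraction of sites at or above the minimum depth (10× for Illumina, 20× for ONT) could be computed.

```python
from typing import Iterable


def covered_fraction(depth_lines: Iterable[str], min_depth: int) -> float:
    """Return the fraction of positions with depth >= min_depth.

    Assumes each non-empty line is 'reference<TAB>position<TAB>depth'
    (an assumption about the depth file layout, not taken from the paper).
    """
    total = 0
    covered = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")
        total += 1
        if int(depth) >= min_depth:
            covered += 1
    if total == 0:
        raise ValueError("no positions found in depth input")
    return covered / total


# Toy four-position example (made-up depths, for illustration only).
example = ["ref\t1\t5", "ref\t2\t12", "ref\t3\t25", "ref\t4\t0"]
illumina_fraction = covered_fraction(example, min_depth=10)  # 10x threshold
ont_fraction = covered_fraction(example, min_depth=20)       # 20x threshold
print(illumina_fraction, ont_fraction)
```

With a real depth file, the same function could be called on an open file handle, and a sample would count as near-complete when the returned fraction exceeds 0.99.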

**Figure 2.**
Mismatch counts between the consensus sequences generated by each method, compared pairwise for each sample. These matrices were generated only from consensus sequences with determined bases for more than 99 per cent of the genome. If either of the two sequences had more than 1 per cent of undetermined bases (N), the comparison was not assessed (grey tiles). Blue tiles correspond to perfect identity and orange tiles to mismatches; the number of mismatches is indicated inside the tile. Matrices were generated with an R script using the DECIPHER (alignment), ape (distance matrices), and ggplot2 (charts) libraries. Of note, undetermined bases and deletions were not considered in the calculation of mismatches. * For sample #19, position 533 is undetermined by the ARTIC method, and therefore no SNP is observed between the ARTIC and CleanPlex methods.
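
The comparison rule described in this caption can be sketched in Python as follows. This is an illustrative stand-in for the authors' R workflow, not a reproduction of it: the function name `mismatch_count` is hypothetical, the two sequences are assumed to be already aligned to the same length, and the fraction of N is computed over the aligned length (an assumption). Ambiguity codes other than N are treated as ordinary characters, which is a simplification.

```python
from typing import Optional


def mismatch_count(seq_a: str, seq_b: str,
                   max_n_fraction: float = 0.01) -> Optional[int]:
    """Count mismatches between two aligned consensus sequences.

    Returns None (comparison not assessed) if either sequence has more
    than max_n_fraction undetermined bases (N). Positions where either
    sequence has N or a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    a, b = seq_a.upper(), seq_b.upper()
    for seq in (a, b):
        if seq.count("N") / len(seq) > max_n_fraction:
            return None  # too many undetermined bases: grey tile
    skipped = {"N", "-"}
    return sum(
        1 for x, y in zip(a, b)
        if x not in skipped and y not in skipped and x != y
    )


# Toy 200-nt sequences (made up for illustration only).
ref_like = "ACGT" * 50
one_snp = "TN-T" + "ACGT" * 49       # one substitution, one N, one gap
many_n = "N" * 10 + ref_like[10:]    # 5 per cent N, above the threshold
print(mismatch_count(ref_like, one_snp))  # N and gap positions are skipped
print(mismatch_count(ref_like, many_n))   # comparison not assessed
```

Returning `None` rather than a number keeps "not assessed" distinct from "zero mismatches", mirroring the grey versus blue tiles.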
