Genome Med. 2024 Sep 9;16(1):111.
doi: 10.1186/s13073-024-01380-x.

Evaluating metagenomics and targeted approaches for diagnosis and surveillance of viruses

Sarah Buddle et al. Genome Med. 2024.

Abstract

Background: Metagenomics is a powerful approach for the detection of unknown and novel pathogens. Workflows based on Illumina short-read sequencing are becoming established in diagnostic laboratories. However, high sequencing depth requirements, long turnaround times, and limited sensitivity hinder broader adoption. We investigated whether we could overcome these limitations using protocols based on untargeted sequencing with Oxford Nanopore Technologies (ONT), which offers real-time data acquisition and analysis, or a targeted panel approach, which allows the selective sequencing of known pathogens and could improve sensitivity.

Methods: We evaluated detection of viruses with readily available untargeted metagenomic workflows using Illumina and ONT, and an Illumina-based enrichment approach using the Twist Bioscience Comprehensive Viral Research Panel (CVRP), which targets 3153 viruses. We tested samples consisting of a dilution series of a six-virus mock community in a human DNA/RNA background, designed to resemble clinical specimens with low microbial abundance and high host content. Protocols were designed to retain the host transcriptome, since this could help confirm the absence of infectious agents. We further compared the performance of commonly used taxonomic classifiers.

Results: Capture with the Twist CVRP increased sensitivity by at least 10- to 100-fold over untargeted sequencing, making it suitable for the detection of low viral loads (60 genome copies per ml (gc/ml)), but additional methods may be needed in a diagnostic setting to detect untargeted organisms. While untargeted ONT had good sensitivity at high viral loads (60,000 gc/ml), at lower viral loads (600 to 6000 gc/ml), longer and more costly sequencing runs would be required to achieve sensitivities comparable to the untargeted Illumina protocol. Untargeted ONT provided better specificity than untargeted Illumina sequencing. However, the application of robust thresholds standardized results between taxonomic classifiers. Host gene expression analysis is optimal with untargeted Illumina sequencing but possible with both the CVRP and ONT.
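
A fold gain in sensitivity of this kind can be read as a ratio of limits of detection across a dilution series. The following is a minimal Python sketch, assuming the limit of detection is simply the lowest tested viral load at which a workflow detects the virus. The viral loads are those named in the abstract, but the detected/not-detected outcomes are illustrative placeholders rather than results from the paper, and the function names are hypothetical.

```python
def limit_of_detection(detected_by_load):
    """Return the lowest viral load (gc/ml) with a positive detection, or None."""
    positive_loads = [load for load, hit in detected_by_load.items() if hit]
    return min(positive_loads) if positive_loads else None


def fold_gain(lod_reference, lod_improved):
    """Ratio of two limits of detection; values above 1 mean improved sensitivity."""
    return lod_reference / lod_improved


# Illustrative outcomes only (assumption): True means the virus was detected.
untargeted = {60: False, 600: False, 6000: True, 60000: True}
targeted = {60: True, 600: True, 6000: True, 60000: True}

gain = fold_gain(limit_of_detection(untargeted), limit_of_detection(targeted))
print(f"Fold gain in sensitivity: {gain:.0f}")
```

With these placeholder outcomes the untargeted limit of detection is 6000 gc/ml and the targeted one is 60 gc/ml, giving a 100-fold gain; a limit of 600 gc/ml for the untargeted workflow would give 10-fold instead.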

Conclusions: Metagenomics has the potential to become standard-of-care in diagnostics and is a powerful tool for the discovery of emerging pathogens. Untargeted Illumina and ONT metagenomics and capture with the Twist CVRP have different advantages with respect to sensitivity, specificity, turnaround time and cost, and the optimal method will depend on the clinical context.

Keywords: Clinical metagenomics; Epidemiological surveillance; Next-generation sequencing; Pathogen detection; Viral diagnostics.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Metagenomic sequencing and experimental outline. A Overview of a typical clinical metagenomic processing pipeline. B Flow chart summarizing experimental design, which involves inputting mock and clinical samples into three metagenomic workflows: Illumina DNA and RNA seq using NEBNext and KAPA kits respectively, ONT DNA and RNA seq using the Rapid PCR barcoding kit and the Rapid Smart-9N method respectively, and finally the targeted DNA- and RNA-based Twist viral research panel, sequenced on the Illumina platform. The resulting data was analysed using different taxonomic classifiers. Produced with biorender.com
Fig. 2
Detection of mock community viruses. Coverage and base pairs aligned to the six expected viral species in mock samples, by untargeted Illumina and ONT sequencing and capture probe enrichment with the Twist Bioscience Comprehensive Viral Research Panel followed by Illumina sequencing. A Percentage genome coverage at depth 1× of species in mock community. B log10(bases) aligning to reference genome. Samples where a virus was detected in the full dataset but not the subsampled dataset are indicated with a *. Genome copy numbers refer to an average across the viral species (see Table S1). Each point shows the mean of at least two technical replicates; error bars show the range. PCR duplicate reads removed.
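
The two quantities plotted in Fig. 2 can be illustrated with a short Python sketch. It assumes a per-position depth list for one reference genome (for example, parsed from the output of a depth-reporting tool) and approximates the aligned bases as the sum of per-position depths. The function names and the example depths are hypothetical and are not taken from the paper's pipeline.

```python
import math


def percent_coverage(depths, min_depth=1):
    """Percentage of reference positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def log10_aligned_bases(depths):
    """log10 of total aligned bases, or None when nothing aligned."""
    total = sum(depths)
    return math.log10(total) if total > 0 else None


# Hypothetical depths for a 10-position reference (assumption).
depths = [0, 2, 1, 0, 5, 1, 0, 0, 3, 1]
print(percent_coverage(depths))      # 6 of 10 positions have depth >= 1
print(log10_aligned_bases(depths))   # log10 of 13 aligned bases
```

Returning None for an undetected virus avoids taking the logarithm of zero; how such samples are drawn on a plot is a separate choice.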
Fig. 3
Sensitivity and number of false positive species identified by taxonomic classifiers. A Sensitivity to the species in the mock community before and after the application of thresholds in the legend and further defined in the Supplementary information, for seven different taxonomic classifiers, by untargeted Illumina and ONT sequencing and capture probe enrichment with the Twist Bioscience Comprehensive Viral Research Panel followed by Illumina sequencing. MEGAN-LR and the One Codex Twist report are only designed for ONT and Twist sequencing respectively so were only run for these platforms. B, C Number of false positive species, defined as a species that is classified as positive but not present in the mock community. B False positive species from the raw output of the taxonomic classifiers with no thresholds applied. C Comparison of the numbers of viral positive species identified before and after the application of thresholds. RPMR: reads per million ratio, PMR: proportion of (nonhuman classified) microbial reads—see Supplementary Information for further details. Genome copy numbers refer to an average across the viral species—see Table S1. Each bar shows the mean of at least two technical replicates
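
The metrics in Fig. 3 can be sketched in Python as follows. Sensitivity is taken as the fraction of expected mock community species that are called, and a false positive is any called species outside the mock community, as the caption defines it. The PMR filter assumes the input counts contain only nonhuman, microbially classified reads. The cutoff value, species labels and function names are hypothetical; the thresholds actually used are described in the paper's Supplementary Information.

```python
def pmr(species_reads, total_microbial_reads):
    """Proportion of nonhuman classified microbial reads assigned to one species."""
    if total_microbial_reads == 0:
        return 0.0
    return species_reads / total_microbial_reads


def call_species(read_counts, min_pmr=0.0):
    """Species with at least one read whose PMR meets the cutoff."""
    total = sum(read_counts.values())
    return {
        species
        for species, reads in read_counts.items()
        if reads > 0 and pmr(reads, total) >= min_pmr
    }


def evaluate(called, expected):
    """Return (sensitivity, sorted list of false positive species)."""
    if not expected:
        raise ValueError("expected must contain at least one species")
    true_positives = called & expected
    false_positives = called - expected
    return len(true_positives) / len(expected), sorted(false_positives)


# Hypothetical classifier output for one sample (assumption).
expected = {"virus_A", "virus_B", "virus_C"}
counts = {"virus_A": 900, "virus_B": 95, "virus_C": 0, "contaminant_X": 5}

print(evaluate(call_species(counts), expected))                # no threshold
print(evaluate(call_species(counts, min_pmr=0.01), expected))  # with PMR cutoff
```

In this toy input the cutoff removes the low-abundance contaminant without changing sensitivity; a stricter cutoff would also start to remove true positives at low viral loads.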
Fig. 4
Host transcriptomic analysis. A–D Read counts per million assigned to each gene in the human genome by untargeted Illumina, untargeted ONT and targeted Illumina sequencing using the Twist Viral Research Panel. Each point represents a gene. A–C raw reads; D only reads that map across splice junctions. E Total counts for spliced and other reads. F Number of genes identified by each pair of technologies. G Counts per million of reads by platform. Each panel shows the log2(CPM) as estimated by a different technology. Outliers not shown. All comparisons are statistically significant (p < 0.01) with a pairwise Wilcoxon test other than those indicated.
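
The counts per million (CPM) normalization behind Fig. 4 can be sketched in Python as below. It scales each gene's read count by the total assigned reads, then takes log2 after adding a pseudocount so that genes with zero reads remain defined. The pseudocount of 1, the gene labels and the function names are assumptions for illustration; the caption does not say how zero counts were handled.

```python
import math


def counts_per_million(gene_counts):
    """Scale raw per-gene read counts to counts per million assigned reads."""
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any gene")
    return {gene: count * 1_000_000 / total for gene, count in gene_counts.items()}


def log2_cpm(cpm, pseudocount=1.0):
    """log2-transform CPM values, adding a pseudocount to avoid log2(0)."""
    return {gene: math.log2(value + pseudocount) for gene, value in cpm.items()}


# Hypothetical read counts for three genes (assumption).
counts = {"GENE_A": 500, "GENE_B": 1500, "GENE_C": 0}
cpm = counts_per_million(counts)
print(cpm)
print(log2_cpm(cpm))
```

Because CPM divides by each library's own total, values from platforms with very different read yields can be placed on the same axis, which is what the pairwise panels compare.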
Fig. 5
Turnaround times and output data volumes. A Time taken for library preparation for the different protocols tested. The Twist panel uses a combined DNA and RNA-Seq protocol. The DNA + RNA bars for the untargeted sequencing indicate the time taken if both protocols are performed by a single operator. B Total cost (including library preparation) to sequence the indicated number of samples plus a single negative control, to a depth of 5 GB. ONT costs are shown with 48- and 72-h maximum run times per flow cell. C Volume of data output by time for a range of Illumina sequencing kits and ONT sequencing with PromethION flow cells. The Illumina kits produce a set amount of data after the sequencing run is complete; this is shown by pink dots. In ONT sequencing, data is output continuously and the run can be stopped at any time, until the flow cell becomes degraded. PromethION data (green/blue dotted lines) shows the average of our RNA and DNA-Seq runs, passed reads only. Data outputs for Illumina were obtained from the product specification data as of April 2024.
