Nat Commun. 2024 Nov 12;15(1):9016.
doi: 10.1038/s41467-024-51470-y.

Laboratory validation of a clinical metagenomic next-generation sequencing assay for respiratory virus detection and discovery

Jessica Karielle Tan et al. Nat Commun.

Abstract

Tools for rapid identification of novel and/or emerging viruses are urgently needed for clinical diagnosis of unexplained infections and pandemic preparedness. Here we developed and clinically validated a largely automated metagenomic next-generation sequencing (mNGS) assay for agnostic detection of respiratory viral pathogens from upper respiratory swab and bronchoalveolar lavage samples in <24 h. The mNGS assay achieved mean limits of detection of 543 copies/mL, viral load quantification with 100% linearity, and 93.6% sensitivity, 93.8% specificity, and 93.7% accuracy compared to gold-standard clinical multiplex RT-PCR testing. Performance increased to 97.9% overall predictive agreement after discrepancy testing and clinical adjudication, which was superior to that of RT-PCR (95.0% agreement). To enable discovery of novel, sequence-divergent human viruses with pandemic potential, de novo assembly and translated nucleotide algorithms were incorporated into the automated SURPI+ computational pipeline used by the mNGS assay for pathogen detection. Using in silico analysis, we showed that after removal of all human viral sequences from the reference database, 70 (100%) of 70 representative human viral pathogens could still be identified based on homology to related animal or plant viruses. Our assay, which was granted breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023, demonstrates the feasibility of routine mNGS testing in clinical and public health laboratories, thus facilitating a robust and rapid response to the next viral pandemic.

Conflict of interest statement

Competing interests: C.Y.C. is a founder of Delve Bio and on the scientific advisory board for Delve Bio, Flightpath Biosciences, Biomeme, Mammoth Biosciences, BiomeSense and Poppy Health. He is also an inventor on US patent 11380421, “Pathogen detection using next generation sequencing”, under which algorithms for taxonomic classification, filtering, and pathogen detection are used by SURPI+ software. C.Y.C. receives research support from Delve Bio and Abbott Laboratories, Inc. The other authors declare no competing interests.

Figures

Fig. 1. Schematic of the mNGS assay workflow.
A RNA from respiratory samples is extracted and treated with DNase. An internal control is added to assess human background during sequencing. Human rRNA is depleted during cDNA synthesis. Libraries are generated on the automated Tecan MagicPrep NGS instrument. Libraries are normalized, pooled, and loaded onto the sequencer. B Sequences are processed using SURPI+ software for alignment and classification. Reads are preprocessed by trimming of adapters and removal of low-quality/low-complexity sequences, followed by computational subtraction of human reads. Reads are mapped to the closest matched genome to identify non-overlapping regions using the NCBI GenBank and FDA-ARGOS databases. To aid in analysis, automated result summaries, heat maps of both raw and normalized read counts, and coverage and pairwise identity plots are generated for clinical interpretation. Total turnaround time is between 14 and 22 h depending on the type of sequencer used.
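The read preprocessing step in panel B (removal of low-quality and low-complexity sequences) can be illustrated with a minimal Python sketch. The caption does not say which filters SURPI+ applies, so the mean Phred quality cutoff and the Shannon entropy measure used here are stand-in assumptions, and the function names and thresholds are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits) of the base composition of a read."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def keep_read(seq, quals, min_mean_quality=20, min_entropy=1.0):
    """Return True if a read passes both illustrative filters.

    seq:   nucleotide string
    quals: per-base Phred quality scores (same length as seq)
    The thresholds are hypothetical defaults, not values from the assay.
    """
    if not seq or len(seq) != len(quals):
        return False
    mean_quality = sum(quals) / len(quals)
    return mean_quality >= min_mean_quality and shannon_entropy(seq) >= min_entropy


# A homopolymer run is rejected as low complexity; a mixed read is kept.
print(keep_read("A" * 50, [30] * 50))
print(keep_read("ACGT" * 10, [30] * 40))
```

A production pipeline would normally rely on an established trimming and complexity-filtering tool; the sketch only shows the kind of decision made for each read.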
Fig. 2. Enhancements to the SURPI+ bioinformatics pipeline for pathogen identification.
A Schematic diagram of modifications made to the SURPI+ bioinformatics pipeline to enhance its pathogen detection capabilities. The modifications include (1) calculation of the estimated viral load for each detected virus in the sample using a quantitative internal spiked ERCC control (top row), (2) incorporation of reference-grade databases such as the FDA-ARGOS database by “tagging” of GenBank accession numbers in the SURPI+ database (middle row), and (3) identification of novel, sequence-divergent viruses using de novo viral genome assembly and translated nucleotide (amino acid) alignments to a viral protein database (bottom row). B Pairwise and overall comparisons of viral load medians among groups stratified by severity: asymptomatic (n = 24), mild (n = 53), moderate (n = 20), or severe (n = 8) respiratory infection. For the box-and-whisker plots, the solid line within each box represents the median log viral load, while the dashed line indicates the mean log viral load. The interquartile range (IQR) is shown by the height of the box, with whiskers extending to the minimum and maximum values within 1.5 times the IQR. Each point corresponds to a detected virus, with different colors representing different virus species or genera. Mann-Whitney U and Kruskal-Wallis H tests were used for pairwise and overall significance testing, respectively. All tests were two-sided with Bonferroni correction for multiple comparisons, and the significance level was set at 0.05.
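The Bonferroni correction mentioned for panel B can be sketched in a few lines of Python. The sketch assumes the correction is applied across all pairwise comparisons of the four severity groups, which the caption does not state explicitly; the function name is hypothetical and no p-values from the study are used.

```python
from itertools import combinations

# Severity groups named in the caption.
groups = ["asymptomatic", "mild", "moderate", "severe"]
alpha = 0.05

# Every unordered pair of groups is one pairwise comparison.
pairs = list(combinations(groups, 2))
adjusted_alpha = alpha / len(pairs)


def is_significant(p_value, n_comparisons=len(pairs), alpha=alpha):
    """Bonferroni: compare the raw p-value against alpha / number of comparisons."""
    return p_value < alpha / n_comparisons


print(len(pairs), "pairwise comparisons; per-test threshold", round(adjusted_alpha, 5))
```

An equivalent formulation multiplies each raw p-value by the number of comparisons (capped at 1) and compares the result against 0.05.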
Fig. 3. Limits of detection (LoD) study.
Probit regression analysis curves plotting the viral titer in copies/mL (y-axis) against the calculated detection probability (x-axis) of (A) SARS-CoV-2, (B) influenza A, (C) influenza B and (D) respiratory syncytial virus (RSV). The regression curves and error bands (surrounding shaded areas) representing 95% confidence intervals for each curve as determined by probit regression analysis are shaded in a different color for each virus. The probability of detection corresponding to 95% is denoted with a blue circle for each virus. Probit analyses were done using Python software (version 3.7.12). Results show a LoD ranging from 439 to 706 copies/mL for the 4 respiratory viruses in the positive control.
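How a 95% limit of detection is read off a probit curve can be illustrated with a small, standard-library-only Python sketch. It assumes the model P(detect) = Φ(a + b·(log10 concentration − mean log10 concentration)) fitted by Fisher scoring; the dilution data are invented for illustration and the function names are hypothetical, so this is not the authors' analysis code. `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_probit(log_conc, n_tested, n_detected, iterations=50):
    """Fit P(detect) = Phi(a + b * (x - mean_x)) by Fisher scoring."""
    mean_x = sum(log_conc) / len(log_conc)
    centered = [x - mean_x for x in log_conc]  # centering improves conditioning
    a, b = 0.0, 0.0
    for _ in range(iterations):
        u_a = u_b = i_aa = i_ab = i_bb = 0.0
        for x, n, y in zip(centered, n_tested, n_detected):
            eta = a + b * x
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            phi = STD_NORMAL.pdf(eta)
            score = (y - n * p) * phi / (p * (1 - p))   # score contribution
            weight = n * phi * phi / (p * (1 - p))      # Fisher information weight
            u_a += score
            u_b += score * x
            i_aa += weight
            i_ab += weight * x
            i_bb += weight * x * x
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b, mean_x


def limit_of_detection(a, b, mean_x, probability=0.95):
    """Concentration at which the fitted detection probability equals `probability`."""
    z = STD_NORMAL.inv_cdf(probability)
    return 10 ** (mean_x + (z - a) / b)


# Invented dilution series: copies/mL, replicates tested, replicates detected.
concentrations = [100, 200, 400, 800, 1600]
tested = [20, 20, 20, 20, 20]
detected = [4, 9, 15, 19, 20]

a, b, mean_x = fit_probit([math.log10(c) for c in concentrations], tested, detected)
print(f"Estimated 95% LoD: {limit_of_detection(a, b, mean_x):.0f} copies/mL")
```

The confidence bands shown in the figure would require the covariance of the fitted coefficients as well, which this sketch omits.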
Fig. 4. Evaluation of linearity and viral load quantification for the mNGS assay.
A A quantified SARS-CoV-2 PCR-positive nasopharyngeal swab from a patient with COVID-19 was serially diluted in donor nasal swab matrix and tested across 4 log10 dilutions. B A quantified HCV PCR-positive plasma sample from a patient with hepatitis C infection was serially diluted in donor plasma and tested across 4 log10 dilutions. At each dilution, the calculated mean concentration from three replicates is plotted against the expected concentration on a log scale, and the coefficient of determination (R²) is obtained by linear regression.
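The linearity calculation described in this caption (mean of three replicates against expected concentration on a log scale, with R² from linear regression) can be sketched in Python as follows. The replicate values are invented for illustration and the function name is hypothetical.

```python
import math


def log_linear_fit(expected, replicates):
    """Regress log10(mean measured) on log10(expected).

    expected:   expected concentration at each dilution
    replicates: list of replicate measurements per dilution
    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(e) for e in expected]
    ys = [math.log10(sum(r) / len(r)) for r in replicates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented example: four 10-fold dilutions, three replicates each (copies/mL).
expected = [1e3, 1e4, 1e5, 1e6]
replicates = [
    [900, 1150, 1000],
    [9500, 10800, 9900],
    [1.1e5, 0.95e5, 1.0e5],
    [0.9e6, 1.05e6, 1.1e6],
]
slope, intercept, r2 = log_linear_fit(expected, replicates)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
```

A slope close to 1 and an R² close to 1 on the log-log scale indicate that measured concentration tracks expected concentration across the tested range.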
Fig. 5. Demonstration of inclusivity and clinical use cases for the mNGS assay.
A Genotyping of rhinovirus and enterovirus subtypes from PCR-positive nasal swab samples. Conventional clinical multiplex RT-PCR tests do not distinguish between rhinoviruses and enteroviruses, nor are they able to subtype more pathogenic strains such as rhinovirus C or enterovirus D68 in association with acute flaccid myelitis. B Detection of uncommon or rare viral pathogens causing respiratory infections in critically ill mechanically ventilated hospitalized patients. The circles correspond to detected viruses and are color-coded by virus and scaled by read counts. For each detected virus, the read count is shown in the circle, while the genotype identified by the SURPI+ pipeline is shown in the upper right quadrant. Abbreviations: ETA endotracheal aspirate.
Fig. 6. Accuracy evaluation for the mNGS assay.
Pie charts and 2 × 2 contingency tables showing the distribution of detected viruses and performance metrics. (A) mNGS testing against RVP testing, (B) mNGS testing against DTCA, and (C) RVP testing against DTCA. RVP testing using FDA IVD assays includes detection of respiratory syncytial virus, parainfluenza viruses 1–3, metapneumovirus, rhinovirus/enterovirus, influenza A virus, influenza B virus, and adenovirus. Discrepant samples that were mNGS-positive/RVP-negative or mNGS-negative/RVP-positive underwent orthogonal testing by targeted virus-specific PCR at the state public health laboratory and medical chart review for the most likely diagnosis by clinical adjudication. Abbreviations: mNGS metagenomic next-generation sequencing, PCR polymerase chain reaction, RVP viral respiratory panel, DTCA discrepancy testing and clinical adjudication, PPA positive percent agreement, NPA negative percent agreement, OPA overall percent agreement, RSV respiratory syncytial virus, FDA Food and Drug Administration, IVD in vitro diagnostic.
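The agreement metrics named in the abbreviations (PPA, NPA, OPA) follow directly from a 2 × 2 contingency table. A minimal Python sketch, using made-up counts rather than the study's data and a hypothetical function name:

```python
def agreement_metrics(tp, fp, fn, tn):
    """Return (PPA, NPA, OPA) as fractions from a 2 x 2 contingency table.

    tp: test-positive / comparator-positive
    fp: test-positive / comparator-negative
    fn: test-negative / comparator-positive
    tn: test-negative / comparator-negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("comparator must include both positives and negatives")
    ppa = tp / (tp + fn)                    # positive percent agreement
    npa = tn / (tn + fp)                    # negative percent agreement
    opa = (tp + tn) / (tp + fp + fn + tn)   # overall percent agreement
    return ppa, npa, opa


# Illustrative counts only; these are NOT the counts reported in the study.
ppa, npa, opa = agreement_metrics(tp=45, fp=5, fn=5, tn=45)
print(f"PPA={ppa:.1%} NPA={npa:.1%} OPA={opa:.1%}")
```

When the comparator is treated as a gold standard, PPA and NPA correspond to sensitivity and specificity, and OPA to accuracy.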
Fig. 7. In-depth analysis of a rhinovirus C detection by mNGS that was discrepant with RT-PCR.
A A heat map generated from SURPI+ analysis shows 12 reads aligning to rhinovirus C from a single sample, excluding the possibility of cross-contamination. Each column denotes a clinical sample, while each row corresponds to a taxonomic identification at the species, genus, or family level. The asterisks refer to “declassification” of reads from one level to the next higher taxonomic level (for example, from species to genus). B A coverage map shows that the 12 reads span the genome of the most closely matched rhinovirus C genome in the reference database identified by SURPI+ (accession number MG148341.1) without overlap, with coverage of 19% of the ~7000 base pair (bp) genome. C Several mismatches in the primer and probe sequences from published RT-PCR assays targeting the 5’-untranslated region (5’-UTR) are observed when compared to the viral mNGS reads, providing a likely explanation for the discrepant mNGS and RT-PCR results. The four assays are labeled 1 through 4 and correspond to Lu et al. (1), Tapparel et al. (2), Gunson et al. (3), and Steininger et al. (4). The mismatched nucleotides are highlighted with a background colour. Note that the assay from Steininger et al. does not include a probe. Abbreviations: FP forward primer, Pr probe, RP reverse primer.
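The coverage figure in panel B is a breadth-of-coverage calculation: the fraction of reference positions covered by at least one read. A minimal Python sketch follows; the read coordinates are invented (12 evenly spaced 111 bp reads on a 7000 bp reference) and the function name is hypothetical.

```python
def breadth_of_coverage(intervals, genome_length):
    """Fraction of genome positions covered by at least one read.

    intervals: (start, end) alignment coordinates, 0-based, end-exclusive
    Overlapping or adjacent intervals are merged so no position is counted twice.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Invented coordinates: 12 non-overlapping 111 bp reads on a 7000 bp reference.
reads = [(i * 580, i * 580 + 111) for i in range(12)]
print(f"{breadth_of_coverage(reads, 7000):.1%} of the reference is covered")
```

With non-overlapping reads, breadth of coverage is simply the summed read length divided by the genome length, so 12 reads of roughly 110 bp account for about 19% of a ~7000 bp genome.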
Fig. 8. In silico demonstration of novel, sequence-divergent virus detection using the mNGS assay.
A Representative viral reference genomes corresponding to outbreak viruses of clinical and public health significance with pandemic potential are retrieved from the NCBI GenBank database, partitioned into non-overlapping segments, and then randomly sampled and spiked in silico into a negative nasal swab matrix sequencing library. A higher-level set of taxonomic identifiers (species, genus, and/or family) corresponding to these viruses is removed from the SURPI+ reference dataset and the simulated sequencing file is analyzed using both the original and “restricted reference” databases. B Viruses can be detected using the modified SURPI+ pipeline despite lacking a taxonomic reference at levels down to 10–100 reads per million (RPM). Abbreviations: EEEV Eastern equine encephalitis virus, ERCC External RNA Controls Consortium, FDA-ARGOS FDA dAtabase for Reference Grade micrObial Sequences, HFV hemorrhagic fever virus, HIV human immunodeficiency virus, JCPyV JC polyomavirus, PC positive control, PyV polyomavirus, TSPyV trichodysplasia spinulosa polyomavirus, SURPI+ sequence-based ultrarapid pathogen identification, VEEV Venezuelan equine encephalitis virus, WEEV Western equine encephalitis virus.
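The in silico spiking procedure in panel A (partition a genome into non-overlapping segments, sample them at random, add them to a negative background library) and the reads per million (RPM) unit in panel B can be sketched in Python. The segment length, sequences, read counts, and function names are all hypothetical placeholders, not parameters from the study.

```python
import random


def partition(genome, segment_length):
    """Split a genome into consecutive non-overlapping segments.

    A trailing remainder shorter than segment_length is dropped.
    """
    last_start = len(genome) - segment_length
    return [genome[i:i + segment_length] for i in range(0, last_start + 1, segment_length)]


def spike_in(background_reads, segments, n_spiked, seed=0):
    """Randomly sample n_spiked segments (without replacement) and add them to the background."""
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    spiked = rng.sample(segments, n_spiked)
    return background_reads + spiked, spiked


def reads_per_million(n_reads, total_reads):
    """Normalize a read count to reads per million total reads."""
    return n_reads * 1_000_000 / total_reads


# Toy example: a 1000 bp placeholder "genome" and a placeholder background library.
genome = "ACGT" * 250
segments = partition(genome, 100)
background = ["N" * 100] * 99_995
library, spiked = spike_in(background, segments, n_spiked=5)
print(len(segments), "segments;", reads_per_million(len(spiked), len(library)), "RPM spiked")
```

In this toy library, 5 spiked reads out of 100,000 total correspond to 50 RPM, which falls inside the 10–100 RPM range quoted in panel B.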
