Multicenter Study
J Clin Microbiol. 2024 Sep 11;62(9):e0062824.
doi: 10.1128/jcm.00628-24. Epub 2024 Aug 19.

A multicenter study on accuracy and reproducibility of nanopore sequencing-based genotyping of bacterial pathogens

Johanna Dabernig-Heinz et al. J Clin Microbiol.

Abstract

Nanopore sequencing has shown the potential to democratize genomic pathogen surveillance due to its ease of use and low entry cost. However, recent genotyping studies showed discrepant results compared to gold-standard short-read sequencing. Furthermore, although essential for widespread application, the reproducibility of nanopore-only genotyping remains largely unresolved. In our multicenter performance study involving five laboratories, four public health-relevant bacterial species were sequenced with the latest R10.4.1 flow cells and V14 chemistry. Core genome MLST analysis of over 500 data sets revealed highly strain-specific typing errors in all species in each laboratory. Investigation of the methylation-related errors revealed consistent DNA motifs at error-prone sites across participants at read level. Depending on the frequency of incorrect target reads, this either leads to correct or incorrect typing, whereby only minimal frequency deviations can randomly determine the final result. PCR preamplification, recent basecalling model updates and an optimized polishing strategy notably diminished the non-reproducible typing. Our study highlights the potential for new errors to appear with each newly sequenced strain and lays the foundation for computational approaches to reduce such typing errors. In conclusion, our multicenter study shows the necessity for a new validation concept for nanopore sequencing-based, standardized bacterial typing, where single nucleotide accuracy is critical.

Keywords: bacterial typing; cgMLST; genomic surveillance; molecular surveillance; multicenter performance study; nanopore sequencing.
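
The abstract notes that the frequency of incorrect reads at an error-prone site decides whether the final call is right or wrong, and that minimal frequency deviations can tip the result. The following is a minimal illustrative sketch of that effect, assuming a plain majority vote over the bases observed at one position. The function names, the depth of 100 reads, and the read counts are hypothetical; actual assemblers and polishers such as Racon and Medaka use more elaborate models than this.

```python
from collections import Counter


def pileup(true_base, error_base, depth, n_error_reads):
    """Build a toy pileup: the bases that `depth` reads report at one position.

    `n_error_reads` of them carry the systematic error (hypothetical numbers).
    """
    return [error_base] * n_error_reads + [true_base] * (depth - n_error_reads)


def consensus_base(read_bases):
    """Return the most frequent base at the position (simple majority vote)."""
    counts = Counter(read_bases)
    base, _ = counts.most_common(1)[0]
    return base


# Same site, same error type, slightly different error frequency.
call_just_below = consensus_base(pileup("G", "A", depth=100, n_error_reads=48))
call_just_above = consensus_base(pileup("G", "A", depth=100, n_error_reads=52))

print(call_just_below)  # majority of reads still report the true base
print(call_just_above)  # majority of reads now report the erroneous base
```

In this toy setting a shift of four reads out of 100 changes the called base, and with it the allele assigned to the affected locus, which is one way to picture why typing results may differ between laboratories sequencing the same strain.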

Conflict of interest statement

R.C. was an employee of the company Ares Genetics. This does not affect the authors' adherence to all the journal's policies on sharing data and materials. Twenty flow cells were provided free of charge by Oxford Nanopore Technologies. However, the manufacturer did not participate in the study's design, data collection, interpretation, or any other aspects of the research.

Figures

Fig 1
The methodological workflow of the study consists of LR and SR sequencing of 77 isolates. The assemblies of both methods were compared using cgMLST schemes for the respective species to assess differences in the typing results for the respective strain. In addition, the results were visualized in the form of MSTs containing assemblies of both methods.
Fig 2
(a) cgMLST-based MST of L. monocytogenes isolates using short-read (SR) data. Sequencing replicates of identical strains exhibit consistent cgMLST profiles and therefore cluster directly, irrespective of the executing laboratory. (b) MST of the same L. monocytogenes isolates, augmented with LR assemblies from the participating laboratories (in grayscale). Only one SR data set is shown (in green). Depending on the strain under investigation, the LR assemblies of different participants showed inconsistent typing results. For some isolates the LR typing matches that of the SR (an exemplary one in blue shades); for others the LR assemblies differ not only from the SR but also from each other across participants. Furthermore, the magnitude of the observed differences varied between isolates (an exemplary one in yellow and one in red shades). For a clearer presentation, we only show the differences for selected strains; the differences at the isolate level are detailed in Fig. 3.
Fig 3
Allelic differences between the assemblies of the different participants compared to the short-read reference of the respective strains, showing species and isolate-specific differences regarding the number of affected strains and the magnitude of the differences. Isolates with the same ST show a similar error range compared to the assemblies of the individual participants. The cluster threshold for the respective cgMLST is added as a black line. Typing errors in assemblies that are close to this threshold are highly problematic in, e.g., outbreak investigations.
Fig 4
Sequence logos based on ambiguous positions and surrounding bases in a genome reveal conserved sequence patterns and high strain-level agreement between participants. Ambiguous sites identified were purine (R; A or G) or pyrimidine (Y; C or T) discrepancies.
Fig 5
The MST of L. monocytogenes (LM), constructed using both SR reference data (green) and LR data (in grayscale) under the latest optimal conditions (sampling at 5 kHz, 400 bp/s, bacterial methylation model basecalling, polishing with Racon and Medaka using its variant model), illustrates the improvements achieved in nanopore sequencing. To further assess this progress, LR data from two additional participants were generated and analyzed for three selected isolates per species, encompassing one good and two challenging isolates (marked in circles). While this analysis highlights improvements, it also shows that errors remain that still challenge the reproducibility of typing, as exemplified by strains like LM46. All nonzero distances from the re-sequenced LR assemblies to the reference SR assemblies are shown as black numbers.
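
The figure captions describe allelic differences between each participant's LR assembly and the SR reference of the same strain, read against a cgMLST cluster threshold. A minimal sketch of that comparison follows, assuming each cgMLST profile is a mapping from locus name to allele number and that loci missing in either profile are ignored. The locus names, allele numbers, participant labels, and threshold value below are hypothetical placeholders, not data or settings from the study.

```python
def allelic_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ between two cgMLST profiles.

    Assumption: a locus that is absent or None in either profile is skipped
    (pairwise ignoring of missing values).
    """
    shared_loci = set(profile_a) & set(profile_b)
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def within_cluster(distance, threshold):
    """True if the distance does not exceed the scheme's cluster threshold."""
    return distance <= threshold


# Hypothetical profiles: one SR reference and two LR assemblies of one strain.
sr_reference = {"locus1": 1, "locus2": 4, "locus3": 7, "locus4": 2}
lr_participant_a = {"locus1": 1, "locus2": 4, "locus3": 7, "locus4": 2}
lr_participant_b = {"locus1": 1, "locus2": 5, "locus3": 7, "locus4": None}

CLUSTER_THRESHOLD = 3  # hypothetical value; real thresholds are scheme-specific

for name, profile in [("A", lr_participant_a), ("B", lr_participant_b)]:
    d = allelic_distance(sr_reference, profile)
    print(name, d, within_cluster(d, CLUSTER_THRESHOLD))
```

Here participant A reproduces the reference profile, while participant B differs at one locus and has one locus that is skipped as missing. When such spurious differences bring a distance close to the threshold, the same strain could fall inside or outside a cluster depending on which laboratory sequenced it.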
