Genome Res. 2024 Nov 20;34(11):2039-2047.
doi: 10.1101/gr.278848.123.

Accurate bacterial outbreak tracing with Oxford Nanopore sequencing and reduction of methylation-induced errors

Mara Lohde et al. Genome Res.

Abstract

Our study investigates the effectiveness of Oxford Nanopore Technologies for accurate outbreak tracing by resequencing 33 isolates of a 3-year-long Klebsiella pneumoniae outbreak with Illumina short-read sequencing data as the point of reference. We detect considerable base errors through cgMLST and phylogenetic analysis of genomes sequenced with Oxford Nanopore Technologies, leading to the false exclusion of some outbreak-related strains from the outbreak cluster. Nearby methylation sites cause these errors and can also be found in other species besides K. pneumoniae. Based on these data, we explore PCR-based sequencing and a masking strategy, which both successfully address these inaccuracies and ensure accurate outbreak tracing. We offer our masking strategy as a bioinformatic workflow (MPOA) to identify and mask problematic genome positions in a reference-free manner. Our research highlights limitations in using Oxford Nanopore Technologies for sequencing prokaryotic organisms, especially for investigating outbreaks. For time-critical projects that cannot wait for further technological developments by Oxford Nanopore Technologies, our study recommends either using PCR-based sequencing or using our provided bioinformatic workflow. We advise that read mapping-based quality control of genomes should be provided when publishing results.

Figures

Figure 1.
cgMLST typing reveals allelic differences between genomes utilizing different basecaller models and sequencing kits. The minimum spanning tree pictures four K. pneumoniae samples based on 2358 genes for pairwise comparison of allelic variations. Missing values were ignored. Nodes (samples) are connected by lines depicting the distance by numbers of allelic differences. Loci are considered different if one or more bases change between the samples. Loci without allelic differences are described as being the same. Samples with allelic differences of 15 or fewer are considered as part of the cluster. All isolates were prepared with Kit 14 and Kit 12 and basecalled with each respective Guppy “superaccurate” basecalling model (see subsection “Basecalling and assembly” in the Methods). We basecalled all Kit 14–prepared samples with Dorado using the default and a modification-aware model (see subsection “Basecalling and assembly” in the Methods).
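The distance rule described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the typing software used in the study: the function names, the representation of an allele profile as a list of allele identifiers, and the use of `None` for a missing locus are all assumptions. Only the rules themselves (a locus differs if its alleles differ, missing values are ignored, 15 or fewer differences means same cluster) come from the caption.

```python
from typing import Hashable, Optional, Sequence

# Threshold taken from the caption: 15 or fewer allelic differences.
CLUSTER_THRESHOLD = 15

# Hypothetical representation: one allele identifier per locus, None if missing.
Profile = Sequence[Optional[Hashable]]


def allelic_distance(profile_a: Profile, profile_b: Profile) -> int:
    """Count loci whose alleles differ, ignoring loci missing in either sample."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1
        for allele_a, allele_b in zip(profile_a, profile_b)
        if allele_a is not None and allele_b is not None and allele_a != allele_b
    )


def same_cluster(profile_a: Profile, profile_b: Profile,
                 threshold: int = CLUSTER_THRESHOLD) -> bool:
    """Return True if the two samples are within the cluster threshold."""
    return allelic_distance(profile_a, profile_b) <= threshold
```

Under this rule, a single miscalled base inside a gene is enough to change that locus's allele, which is why systematic basecalling errors can push a true outbreak isolate past the threshold.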
Figure 2.
Systematic examination of ambiguous positions for frequency by strand orientation (A), conserved sequences (B), and raw Nanopore signal (C). (A) Violin chart showing the ratio between two bases within the mapped read data separated by strand orientation for 6556 ambiguous positions in 33 K. pneumoniae samples. Every ambiguous position is grouped by which two bases appear and is labeled by their respective degenerate base (IUPAC nucleotide code). For example, “R” stands for a combination in which either A or G is found at that position. Each dot represents a base occurrence within the respective base combination at the ambiguous position. (B) Sequence logo of observed sequence pattern around the ambiguous bases R and Y on the chromosomal contig of K. pneumoniae for one sample. (C) Raw signal level (FAST5/POD5) of ambiguous positions (yellow) for Kit 14 (above) with methylated bases and SQK-RPB114.24 without modifications (below). Less clear signals are observable in ambiguous positions (yellow) for Kit 14. Signal plots were generated with remora (v.2.1.3; https://github.com/nanoporetech/remora).
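As a small illustration of the labeling used in panel A, the standard IUPAC two-base codes can be looked up from the pair of bases observed at a position. The helper name `degenerate_code` is hypothetical and is not taken from the study's code; the code table itself is the standard IUPAC nomenclature.

```python
# Standard IUPAC codes for two-base ambiguities.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",  # purine
    frozenset("CT"): "Y",  # pyrimidine
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def degenerate_code(base1: str, base2: str) -> str:
    """Return the IUPAC code for a position where two different bases occur."""
    pair = frozenset((base1.upper(), base2.upper()))
    if len(pair) != 2:
        raise ValueError("expected two different bases")
    try:
        return IUPAC_TWO_BASE[pair]
    except KeyError:
        raise ValueError(f"not a pair of A/C/G/T bases: {base1!r}, {base2!r}") from None
```

For example, `degenerate_code("A", "G")` gives `"R"` and `degenerate_code("C", "T")` gives `"Y"`, the two codes shown in panel B.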
Figure 3.
PCR-based sequencing or masking of ambiguous positions reduces allelic or phylogenetic distances. (A) Minimum spanning trees (missing values ignored in pairwise comparisons) of eight K. pneumoniae outbreak samples based on 2358 genes to compare allelic differences between Illumina and Nanopore SQK-NBD114.24 genomes (Kit 14; left) and Illumina and Nanopore SQK-RPB114.24 genomes (PCR; right). Nodes (samples) are connected by lines depicting the distance by numbers of allelic differences. Loci are considered different if one or more bases change between the samples. Loci without allelic differences are described as being the same. Samples with allelic differences of 15 or fewer are considered as part of the cluster. (B) Phylogenetic tree based on core genome SNP alignment between eight K. pneumoniae outbreak samples (colored nodes), prepared with Illumina (ill), Nanopore SQK-NBD114.24 (Kit 14), and SQK-RPB114.24 (Kit 14 PCR) compared with the masked Kit 14 assemblies (masked).
Figure 4.
MPOA workflow to mask ambiguous and low-coverage (0×–10× sequencing depth) positions in genome files. The workflow provides masked assemblies containing all contigs, as well as separate masked chromosome and plasmid FASTA files. For each sample, a FASTA file containing every ambiguous position plus its surrounding bases is also generated for further analysis (Supplemental Code 2; https://github.com/replikation/MPOA).
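The masking idea can be illustrated with a short Python sketch. It is not the MPOA implementation: the function name, the inputs (a per-base depth list and a collection of 0-based ambiguous positions), the use of `N` as the masking character, and treating a depth of exactly 10 as low coverage are all assumptions made for this example. How ambiguous positions and depths are derived from the read mapping is outside its scope.

```python
from typing import Iterable, Sequence


def mask_sequence(sequence: str,
                  depths: Sequence[int],
                  ambiguous_positions: Iterable[int],
                  max_low_depth: int = 10,   # assumption: 0x-10x inclusive is "low"
                  mask_char: str = "N") -> str:  # assumption: mask with N
    """Mask positions that are ambiguous or have low sequencing depth."""
    if len(sequence) != len(depths):
        raise ValueError("sequence and depths must have the same length")
    ambiguous = set(ambiguous_positions)  # 0-based indices into the sequence
    return "".join(
        mask_char if depth <= max_low_depth or index in ambiguous else base
        for index, (base, depth) in enumerate(zip(sequence, depths))
    )
```

Masked positions then no longer count as real differences in downstream comparisons that ignore missing or undetermined bases.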
