Front Microbiol. 2022 Mar 3;13:796465.
doi: 10.3389/fmicb.2022.796465. eCollection 2022.

Comparing Long-Read Assemblers to Explore the Potential of a Sustainable Low-Cost, Low-Infrastructure Approach to Sequence Antimicrobial Resistant Bacteria With Oxford Nanopore Sequencing

Ian Boostrom et al. Front Microbiol. 2022.

Abstract

Long-read sequencing (LRS) can resolve repetitive regions, a limitation of short-read (SR) data. Reduced cost and instrument size have led to a steady increase in LRS across diagnostics and research. Here, we re-basecalled FAST5 data sequenced between 2018 and 2021 and analyzed the data in relation to gDNA across a large dataset (n = 200) spanning a wide GC content (25-67%). We examined whether re-basecalled data would improve the hybrid assembly and, for a smaller cohort, compared long-read (LR) assemblies in the context of antimicrobial resistance (AMR) genes and mobile genetic elements. We included a cost analysis when comparing SR and LR instruments. We compared the R9 and R10 chemistries and reported not only a larger yield but also increased read quality with R9 flow cells. There were often discrepancies in ARG presence/absence and/or variant detection in LR assemblies. Flye-based assemblies were generally efficient at detecting the presence of ARG on both the chromosome and plasmids. Raven performed more quickly but inconsistently recovered small plasmids, notably a ∼15-kb Col-like plasmid harboring blaKPC. Canu assemblies were the most fragmented, with genome sizes larger than expected. LR assemblies failed to consistently determine multiple copies of the same ARG as identified by the Unicycler reference. Even with improvements to ONT chemistry and basecalling, long-read assemblies can lead to misinterpretation of data. If LR data are currently being relied upon, it is necessary to perform multiple assemblies, although this is resource (computing) intensive and not yet readily available/usable.
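The comparison of ARG calls across several long-read assemblies against a reference assembly can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `summarize_arg_agreement`, the dictionary layout (gene name mapped to copy number), and all gene counts below are hypothetical and chosen only to show the bookkeeping.

```python
def summarize_arg_agreement(reference, assemblies):
    """For each ARG in the reference, list the assemblies reporting the same copy number.

    reference:  dict mapping gene name -> copy number in the reference assembly
    assemblies: dict mapping assembler name -> (dict of gene name -> copy number)
    """
    summary = {}
    for gene, ref_copies in reference.items():
        # A gene missing from an assembly's calls is treated as zero copies.
        matching = sorted(
            name
            for name, calls in assemblies.items()
            if calls.get(gene, 0) == ref_copies
        )
        summary[gene] = {
            "reference_copies": ref_copies,
            "matching": matching,
            "n_matching": len(matching),
        }
    return summary


# Hypothetical calls, invented for illustration; these are NOT results from the paper.
reference = {"tet(X4)": 2, "blaTEM": 1}
assemblies = {
    "flye": {"tet(X4)": 1, "blaTEM": 1},
    "raven": {"tet(X4)": 2, "blaTEM": 1},
    "canu": {"tet(X4)": 3},
}

result = summarize_arg_agreement(reference, assemblies)
for gene, info in result.items():
    print(gene, info["n_matching"], info["matching"])
```

Comparing copy numbers rather than simple presence/absence is a design choice here, made because the abstract highlights multi-copy ARGs as a point of disagreement between assemblies.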

Keywords: Guppy; MinION; Oxford Nanopore Technology (ONT); antimicrobial resistance (AMR); antimicrobial resistance genes (ARG); de novo assembly; long-read sequencing (LRS); plasmid.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
A flow diagram summarizing the approach and methods for the three stages of analysis: (1) orange, isolate selection, re-basecalling with Guppy v5.0.11, and comparison of long-read metrics; (2) green, repeat hybrid assembly with Unicycler v0.4.9 using the re-basecalled long reads as input (-l parameter); (3) purple, additional long-read-only assembly comparison.
FIGURE 2
A flow diagram summarizing the dataset, the number of exclusions, and the number of isolates selected for the two-stage downstream analyses: comparative hybrid assembly (Unicycler) with re-basecalled (Guppy v5.0.11) fastq (n = 62/200), and additional comparison with multiple long-read assemblers (n = 25/200).
FIGURE 3
Scatter plots summarizing the hybrid assembly metrics and gDNA extraction yield (ng/μl). Graphs (A–C) compare the N50 value against the total genome size, the %GC content, and the gDNA yield input; graphs (D–F) compare the number of contigs (over 1,000 bp) against the same three variables.
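For readers unfamiliar with the two assembly metrics plotted here, the following Python sketch shows the standard N50 definition (the contig length at which the cumulative length of contigs, sorted longest first, reaches half of the total assembly length) and a contig count above a length cutoff. The function names are illustrative, the contig lengths are made up, and treating "over 1,000 bp" as strictly greater than 1,000 is an assumption.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running total to half the assembly is the N50.
        if running >= half_total:
            return length
    return 0


def contigs_over(lengths, min_len=1000):
    """Count contigs strictly longer than min_len (assumed reading of 'over 1,000 bp')."""
    return sum(1 for length in lengths if length > min_len)


# Made-up contig lengths for illustration only.
contig_lengths = [4000, 3000, 2000, 1000, 500]
print(n50(contig_lengths))           # total 10,500; half 5,250; reached at the 3,000 bp contig
print(contigs_over(contig_lengths))  # 4000, 3000 and 2000 exceed the cutoff
```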
FIGURE 4
Box and whisker plots comparing original fastq metrics (purple) with FAST5 reads re-basecalled with Guppy v5.0.11 (green). Three long-read metrics were assessed: number of reads (A–D), mean read quality (E–H), and read length N50 value (I–L). In addition to assessing overall read metrics following re-basecalling (A,E,I), data were grouped into categories for gDNA extraction (B,F,J), gDNA yield (C,G,K), and %GC content (D,H,L).
FIGURE 5
Mash distance matrix comparing the hybrid assembly to each long-read-only assembly variation for n = 23 isolates (MIN-; only isolates for which all assemblies were generated were included). Darker shading (blue) represents a greater genomic distance. The species/genera are indicated on the left.
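As background on the metric in this matrix, the Mash distance is derived from the Jaccard index j of the k-mer sets of two sequences as D = -(1/k) ln(2j / (1 + j)). The Python sketch below is a simplification: it computes an exact Jaccard index over all k-mers instead of the MinHash sketch that Mash uses to estimate it, ignores reverse complements, and uses toy sequences with k = 4 (Mash defaults to k = 21; the settings used in the paper are not stated here). All function names are illustrative.

```python
import math


def kmer_set(sequence, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard index of two k-mer sets (Mash estimates this from sketches)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mash_distance(j, k):
    """Convert a Jaccard index into a Mash distance, bounded to [0, 1]."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: maximal distance
    if j >= 1.0:
        return 0.0  # identical k-mer sets
    return min(1.0, -math.log(2.0 * j / (1.0 + j)) / k)


# Toy sequences, invented for illustration; real assemblies are millions of bases long.
k = 4
hybrid = "ATGCGTACGTTAGCCGATAG"
long_read_only = "ATGCGTACGTTAGCCGTTAG"  # one substitution relative to `hybrid`

j = jaccard(kmer_set(hybrid, k), kmer_set(long_read_only, k))
print(round(j, 3), round(mash_distance(j, k), 3))
```

A distance of 0 means the two assemblies share all k-mers; larger values correspond to the darker cells described in the caption.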
FIGURE 6
A schematic to compare Klebsiella ARG detection from the Unicycler (hybrid) assembly (left) with the short-read (SR) assembly (right). The stacked bar graphs (center) denote the number of LR assemblies with a 100% match to the Unicycler assembly. (A) MIN-106 with blaNDM-5 and blaOXA-48-like on the same plasmid; (B) MIN-119 with two plasmids, both containing blaNDM-1; (C) MIN-129, a K. quasipneumoniae isolate containing blaKPC-2.
FIGURE 7
A schematic to compare E. coli ARG detection from the Unicycler (hybrid) assembly (left) with the short-read (SR) assembly (right). The stacked bar graphs (center) denote the number of LR assemblies with a 100% match to the Unicycler assembly. (A) MIN-036 with blaNDM-5, blaTEM, and blaCTX-M-15 on the same plasmid; (B) MIN-040 with tet(X4), mcr-1.1, and blaTEM on the same plasmid; (C) MIN-049 with a plasmid carrying two copies of tet(X4).
FIGURE 8
A schematic to compare A. baumannii ARG detection from the Unicycler (hybrid) assembly (left) with the short-read (SR) assembly (right). The stacked bar graphs (center) denote the number of LR assemblies with a 100% match to the Unicycler assembly. (A) MIN-005 with blaNDM located on the chromosome; (B) MIN-010 with blaNDM-2 and tet(X3) located on the same plasmid; (C) MIN-012 with blaNDM-1 and blaOXA-58 on the same plasmid.
FIGURE 9
A schematic to compare ARG detection from the Unicycler (hybrid) assembly (left) with the short-read (SR) assembly (right). The stacked bar graphs (center) denote the number of LR assemblies with a 100% match to the Unicycler assembly. (A) MIN-137, a P. aeruginosa isolate with blaVIM-28 located on a plasmid; (B) MIN-029, an E. asburiae isolate with blaTEM and mcr-10 located on a plasmid; (C) MIN-182, an S. agalactiae isolate with aminoglycoside and macrolide antibiotic resistance genes.
FIGURE 10
A schematic to visualize (A) the 3′ and 5′ ends of the Canu-based assemblies, indicating that they contain both versions of the UU375 rearrangement relative to Unicycler (as well as the adjacent non-inverted genes) at the beginning and end of the assembly; PAGKEQ repeat motifs (orange) and primers +226 and –125 (dark and light green, respectively) are shown for context of the small inverted region; (B) the alignment of short reads to the Unicycler assembly of MIN-201 (Ureaplasma urealyticum): bases representing mismatches in base calls are identified with different color highlighting, contigs containing the 23S rRNA mutation A2058G (blue arrow) mediating resistance are identified at the end of the black probed region, and contigs containing wild-type susceptible 23S rRNA A2058 are identified by green probed regions; (C) a higher-resolution view of the region identified by a dotted box in (B).
