[Preprint]. 2024 Aug 10:2024.03.15.585215.
doi: 10.1101/2024.03.15.585215.

OCTOPUS: Disk-based, Multiplatform, Mobile-friendly Metagenomics Classifier

Simone Marini et al. bioRxiv.

Abstract

Portable genomic sequencers such as Oxford Nanopore's MinION enable real-time applications in clinical and environmental health. However, downstream analytics become a bottleneck when bioinformatics pipelines are unavailable, e.g., when cloud processing is unreachable for lack of an Internet connection, or when only low-end computing devices can be carried on site. Here we present platform-friendly software for portable metagenomic analysis of Nanopore data, the Oligomer-based Classifier of Taxonomic Operational and Pan-genome Units via Singletons (OCTOPUS). Written in Java, OCTOPUS reimplements several features of the popular Kraken2 and KrakenUniq software and adds original components for improving metagenomics classification on incomplete/sampled reference databases, making it well suited to running on smartphones or tablets. OCTOPUS obtains sensitivity and precision comparable to Kraken2, while dramatically decreasing (4- to 16-fold) the false positive rate, and yields high correlation on real-world data. OCTOPUS is available along with customized databases at https://github.com/DataIntellSystLab/OCTOPUS and https://github.com/Ruiz-HCI-Lab/OctopusMobile.

Figures

Figure 1.
Calculation of k-mer minimizers using rolling hash (A), k-mer probabilistic filtering based on species’ cooccurrence (B), and compact (64- to 32-bit) k-mer indexing through Bloom filters, minimal perfect hashing and compression (C).
Figure 2.
Comparison of classification performance between OCTOPUS and Kraken2. Panels (A) and (B) show the distribution of sensitivity and precision on simulated bacterial datasets (4,202 genomes, ~10 million reads), while panel (C) shows the false positive rate on non-bacterial datasets made of mammalian and viral genomes (7 genomes and 241,673 reads; 47 genomes and 3,621 reads, respectively).
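The minimizer step named in Figure 1 (A) can be illustrated with a minimal Java sketch. It is not taken from the OCTOPUS source. The class name `MinimizerSketch`, the method names, and the parameters `k` (k-mer length) and `l` (minimizer length) are hypothetical. As a stand-in for the rolling hash, the sketch rolls a 2-bit encoding of each l-mer and scrambles it with the MurmurHash3 64-bit finalizer. It also ignores reverse complements and restarts at any non-ACGT base, simplifications that a real classifier may not make.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the OCTOPUS implementation.
class MinimizerSketch {

    // Map A, C, G, T to 2-bit codes; return -1 for any other symbol (e.g. N).
    static int code(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:  return -1;
        }
    }

    // MurmurHash3 64-bit finalizer, used here to scramble the 2-bit encoding
    // so that the minimum is not biased toward poly-A l-mers.
    static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    // Returns one minimizer (the smallest hashed l-mer, compared as unsigned)
    // for every k-mer made only of A/C/G/T, in left-to-right order.
    static List<Long> minimizers(String seq, int k, int l) {
        if (l < 1 || l > 31 || k < l) {
            throw new IllegalArgumentException("need 1 <= l <= 31 and l <= k");
        }
        List<Long> out = new ArrayList<>();
        long mask = (1L << (2 * l)) - 1;  // keeps the last l bases
        long enc = 0;                     // rolling 2-bit encoding of the current l-mer
        int valid = 0;                    // length of the current run of valid bases
        // Monotonic deque of {l-mer start, hash}; the front is the window minimum.
        ArrayDeque<long[]> window = new ArrayDeque<>();
        for (int i = 0; i < seq.length(); i++) {
            int c = code(seq.charAt(i));
            if (c < 0) {                  // ambiguous base: restart after it
                valid = 0;
                enc = 0;
                window.clear();
                continue;
            }
            enc = ((enc << 2) | c) & mask; // roll: drop oldest base, append newest
            valid++;
            if (valid < l) continue;       // no complete l-mer yet
            long h = mix(enc);
            // Remove hashes that can never be the minimum again.
            while (!window.isEmpty()
                    && Long.compareUnsigned(window.peekLast()[1], h) >= 0) {
                window.pollLast();
            }
            window.addLast(new long[] {i - l + 1, h});
            // Remove l-mers that start before the current k-mer.
            long kmerStart = i - k + 1;
            while (window.peekFirst()[0] < kmerStart) {
                window.pollFirst();
            }
            if (valid >= k) out.add(window.peekFirst()[1]);
        }
        return out;
    }
}
```

Calling `MinimizerSketch.minimizers(read, k, l)` on a read of n valid bases returns n - k + 1 values. Adjacent k-mers often share a minimizer, which is what makes minimizers attractive for compact indexing.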
