J Theor Biol. 2017 Jul 21;425:80-87.
doi: 10.1016/j.jtbi.2017.04.019. Epub 2017 Apr 26.

DBH: A de Bruijn graph-based heuristic method for clustering large-scale 16S rRNA sequences into OTUs


Ze-Gang Wei et al. J Theor Biol.

Abstract

The recent sequencing revolution driven by high-throughput technologies has led to the rapid accumulation of 16S rRNA sequences for microbial communities. Clustering short sequences into operational taxonomic units (OTUs) is a crucial initial step in analyzing metagenomic data. Although many heuristic methods with low computational complexity have been proposed for OTU inference, they select just one sequence as the seed for each cluster, so the results are sensitive to which sequences are chosen to represent the clusters. To address this issue, we present a de Bruijn graph-based heuristic clustering method (DBH) for clustering massive 16S rRNA sequences into OTUs by introducing a novel seed selection strategy and a greedy clustering approach. In comparisons with existing widely used methods on several simulated and real-life metagenomic datasets, DBH shows higher clustering performance and low memory usage, facilitating the overestimation of the number of OTUs. DBH is more effective at handling large-scale metagenomic datasets. The DBH software can be freely downloaded from https://github.com/nwpu134/DBH.git for academic users.
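
To make the terms in the abstract concrete, the following is a minimal Python sketch of two generic building blocks: constructing a de Bruijn graph from k-mers, and greedy clustering in which each cluster is represented by a single seed sequence (the conventional strategy the abstract describes as sensitive to seed choice). It is not the DBH algorithm; the abstract does not specify DBH's seed selection strategy. The function names, the use of k-mer Jaccard similarity as a stand-in for sequence identity, and the values of `k` and `threshold` are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def de_bruijn_graph(seqs, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and each k-mer
    contributes an edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for seq in seqs:
        for kmer in kmers(seq, k):
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def kmer_similarity(a, b, k):
    """Jaccard similarity of k-mer sets (an assumed proxy for identity)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def greedy_cluster(seqs, k=4, threshold=0.5):
    """Single-seed greedy clustering: a sequence joins the first cluster
    whose seed it matches; otherwise it becomes the seed of a new cluster."""
    clusters = []  # list of (seed, members) pairs
    for seq in seqs:
        for seed, members in clusters:
            if kmer_similarity(seq, seed, k) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAG", "TTTTGGGGCC"]  # toy data, not real 16S reads
    for seed, members in greedy_cluster(reads):
        print(seed, len(members))
```

Because the seed of each cluster is simply the first unmatched sequence encountered, the result depends on input order, which illustrates the seed sensitivity the abstract sets out to address.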

Keywords: 16S rRNA; Clustering; Metagenomic; Operational taxonomic units; de Bruijn graph.
