Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2015 Dec;7(4):405-13.
doi: 10.1007/s12539-015-0281-x. Epub 2015 Aug 21.

MetaObtainer: A Tool for Obtaining Specified Species from Metagenomic Reads of Next-generation Sequencing

Affiliations

MetaObtainer: A Tool for Obtaining Specified Species from Metagenomic Reads of Next-generation Sequencing

Weihua Pan et al. Interdiscip Sci. 2015 Dec.

Abstract

Reads classification is an important fundamental problem in metagenomics study. With the development of next-generation sequencing, metagenome samples can be generated using much less money and time. However, the short reads generated by next-generation sequencing make the problem of reads classification much more difficult than before. None of the existing tools can assign NGS short reads to each genome accurately, which limit their use in real application. Fortunately, in many applications, it is meaningless to separate all the species in the metagenome sample from each other. That is because we usually only focus on some specified species categories in the sample and do not care about the others. There is no existing tool that is designed technically for obtaining specified species from short metagenome reads generated by next-generation sequencing. In this paper, we propose a tool named MetaObtainer to obtain the specified species from next-generation sequencing short reads. The tool synthesizes some of newest technologies for processing of short reads, so it can have better performance than other tools. It can (1) deal with next-generation sequencing reads which are shorter than 100 bp with very high accuracy (both of precision and recall are more than 90%); (2) find unknown species using the reference genomes of species which are similar with it; (3) perform well when reads of specified species are very few in the dataset; (4) handle genomes of similar abundance levels as well as different abundance levels (1:10); and (5) obtain multiple species categories from metagenome sample.

Keywords: Classification; Metagenomics; Next-generation sequencing; Short reads.

PubMed Disclaimer

Similar articles

Cited by

Publication types

MeSH terms

LinkOut - more resources