BMC Genomics. 2015 Dec 12:16:1056.
doi: 10.1186/s12864-015-2265-y.

Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database

Jarmo Ritari et al. BMC Genomics. 2015.

Abstract

Background: Current sequencing technology enables taxonomic profiling of microbial ecosystems at high resolution and depth by using the 16S rRNA gene as a phylogenetic marker. Taxonomic assignation of newly acquired data is based on sequence comparisons with comprehensive reference databases to find a consensus taxonomy for representative sequences. Nevertheless, even with well-characterised ecosystems like the human intestinal microbiota, it is challenging to assign genus and species level taxonomy to 16S rRNA amplicon reads. Part of the explanation may lie in the sheer size of the search space, where competition from a multitude of highly similar sequences may prevent reliable assignation at low taxonomic levels. However, when studying a particular environment such as the human intestine, it can be argued that a reference database comprising only sequences that are native to the environment would be sufficient, effectively reducing the search space.

Results: We constructed a 16S rRNA gene database based on high-quality sequences specific for human intestinal microbiota, resulting in a curated data set consisting of 2473 unique prokaryotic species-like groups and their taxonomic lineages, and compared its performance against the Greengenes and Silva databases. The results showed that, regardless of the assignment algorithm used, our database improved taxonomic assignation of 16S rRNA sequencing data by enabling significantly higher species and genus level assignation rates while preserving taxonomic diversity and demanding fewer computational resources.

Conclusion: The curated human intestinal 16S rRNA gene taxonomic database of about 2500 species-like groups described here provides a practical solution for significantly improved taxonomic assignment for phylogenetic studies of the human intestinal microbiota.

Figures

Fig. 1
Main steps in the construction of the human intestinal tract 16S taxonomic database (HITdb). Human intestine-specific sequences were pulled down from the Greengenes and Silva databases using GenBank sequences. The sequence data obtained were clustered at 97 % identity using cultivable human intestinal species as a reference. A cultivable nearest neighbour was determined for each OTU. The taxonomic lineages were determined based on cultivable species taxonomy, Greengenes and manual curation
Fig. 2
Taxonomic assignment of synthetic reads. a Comparison of numbers of family, genus and species level assignments relative to missing assignments at the same levels between the HITdb and Greengenes databases. Taxonomy from Greengenes has been assigned with both the RDP and Uclust algorithms, while taxonomy from HITdb has been assigned solely with RDP. The error bars represent upper and lower limits of 1000 bootstraps. b Venn diagrams showing genus and species level assignations between V1-V3 and V4-V6 synthetic reads for Greengenes and HITdb. The database and algorithm are indicated in columns and the taxonomic level in rows. The green and blue circles represent 16S rRNA gene regions V1-V3 and V4-V6, respectively. The Jaccard index shows the intersection relative to the union for each diagram. The numbers represent absolute counts of assigned taxa
Fig. 3
Taxonomic assignment accuracy. Proportions of correctly assigned synthetic reads relative to the total number of assignments (left) and relative to assigned sequences only (right). The 16S sequence regions V1-V3 and V4-V6 are shown separately at genus and species levels in Greengenes and HITdb. The box plots represent variation over 1000 bootstraps
Fig. 4
Comparison between Greengenes and HITdb using data from biological samples. The data set used is indicated in rows, and relative and absolute numbers of assignments at genus and species levels in columns. At species level, the results for HITdb additionally show the biological species only (i.e. without OTUs) for easier comparison with Greengenes. ***p < 0.001. n = 119 and n = 40 samples for the 454 and Illumina data sets, respectively
Fig. 5
Analysis of Human Microbiome Project data. a Numbers of genera and species found by Greengenes and HITdb in HMP 16S samples (n = 192). b Numbers of shared genera and species between HMP shotgun metagenomics profiling and 16S analysis of the same HMP samples by Greengenes and HITdb (n = 23). The boxplots represent variation over 1000 bootstraps
Fig. 6
Correlation of relative abundances between Greengenes and HITdb. The data points represent log relative abundances of common genera between Greengenes and HITdb from 454 and Illumina 16S amplicon data sets. The data are summarised over samples for each genus (“by genus”) and summarised over genera for each sample (“by sample”). The correlation coefficients are Pearson’s
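
Two of the quantities named in the figure captions are simple ratios, and a short sketch may make them concrete. The Python below is an illustration only, not the authors' code. It computes a Jaccard index (intersection relative to union, as in Fig. 2b) and the two accuracy proportions of Fig. 3, assuming that "total" means every synthetic read, including those left unassigned. The function names, the taxon names and the use of None for a missing assignment are all hypothetical choices made for this example.

```python
def jaccard_index(taxa_a, taxa_b):
    """Size of the intersection divided by size of the union of two taxon sets."""
    a, b = set(taxa_a), set(taxa_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def accuracy_proportions(true_taxa, assigned_taxa):
    """Return (correct / all reads, correct / assigned reads only).

    assigned_taxa uses None for a read that received no assignment
    (an assumption of this sketch, not a convention from the paper).
    """
    pairs = list(zip(true_taxa, assigned_taxa))
    assigned = [(t, p) for t, p in pairs if p is not None]
    correct = sum(1 for t, p in assigned if t == p)
    rel_total = correct / len(pairs) if pairs else 0.0
    rel_assigned = correct / len(assigned) if assigned else 0.0
    return rel_total, rel_assigned


# Hypothetical genus sets recovered from the two amplicon regions.
v1_v3 = {"Bacteroides", "Faecalibacterium", "Roseburia"}
v4_v6 = {"Bacteroides", "Roseburia", "Blautia"}
print(jaccard_index(v1_v3, v4_v6))  # 2 shared of 4 distinct genera -> 0.5

# Hypothetical true and assigned genera for four synthetic reads.
truth = ["Bacteroides", "Blautia", "Roseburia", "Dorea"]
calls = ["Bacteroides", None, "Roseburia", "Blautia"]
print(accuracy_proportions(truth, calls))
```

The two accuracy proportions differ only in the denominator: leaving a read unassigned lowers the first but not the second, which is why the caption reports them separately.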
