Gigascience. 2018 May 1;7(5):giy054.
doi: 10.1093/gigascience/giy054.

Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments

Alexandre Almeida et al. Gigascience.

Abstract

Background: Taxonomic profiling of ribosomal RNA (rRNA) sequences has been the accepted norm for inferring the composition of complex microbial ecosystems. Quantitative Insights Into Microbial Ecology (QIIME) and mothur have been the most widely used taxonomic analysis tools for this purpose, with MAPseq and QIIME 2 being two recently released alternatives. However, no independent and direct comparison between these four main tools has been performed. Here, we compared the default classifiers of MAPseq, mothur, QIIME, and QIIME 2 using synthetic simulated datasets composed of some of the most abundant genera found in the human gut, ocean, and soil environments. We evaluated their accuracy when paired with both different reference databases and variable sub-regions of the 16S rRNA gene.

Findings: We show that QIIME 2 provided the best recall and F-scores at genus and family levels, together with the lowest distance estimates between the observed and simulated samples. However, MAPseq showed the highest precision, with miscall rates consistently <2%. Notably, QIIME 2 was the most computationally expensive tool, with CPU time and memory usage almost 2 and 30 times higher, respectively, than those of MAPseq. Using the SILVA database generally yielded a higher recall than using Greengenes, while assignment results for different 16S rRNA variable sub-regions varied by up to 40% between samples analysed with the same pipeline.

Conclusions: Our results support the use of either QIIME 2 or MAPseq for optimal 16S rRNA gene profiling, and we suggest that the choice between the two should be based on the level of recall, precision, and/or computational performance required.
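
The recall, precision, and F-score values reported in the Findings can be illustrated with a minimal sketch. It assumes the standard presence/absence definitions (a genus counts as a true positive when it is both in the simulated sample and in the classifier output); the abstract does not spell out the exact formulas used, and the function name and genus lists below are hypothetical.

```python
def classification_scores(expected, observed):
    """Return (precision, recall, f_score) for two collections of taxon names.

    Assumes standard definitions; not necessarily the paper's exact method.
    """
    expected, observed = set(expected), set(observed)
    true_pos = len(expected & observed)    # taxa correctly reported
    false_pos = len(observed - expected)   # taxa reported but not simulated (miscalls)
    false_neg = len(expected - observed)   # simulated taxa that were missed

    precision = true_pos / (true_pos + false_pos) if observed else 0.0
    recall = true_pos / (true_pos + false_neg) if expected else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical example: four simulated genera, three genera reported by a tool.
simulated = ["Bacteroides", "Prevotella", "Faecalibacterium", "Roseburia"]
reported = ["Bacteroides", "Prevotella", "Escherichia"]
precision, recall, f_score = classification_scores(simulated, reported)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

Under these definitions a tool can score high on precision while missing many genera, which is the kind of trade-off between MAPseq and QIIME 2 that the Conclusions describe.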

Figures

Figure 1:
Level of recall at the genus level, represented as taxa relative abundances, obtained with each analysis pipeline for the three different biomes (human gut, ocean, and soil). The number of genera correctly identified by each pipeline is indicated above the graph.
Figure 2:
A) Recall, precision, and F-score estimates at the genus level for each tool and database tested. B) F-scores calculated for some of the most commonly tested sub-regions of the 16S rRNA gene: V1-V2, V3-V4, V4, and V4-V5.
Figure 3:
Computational cost of each taxonomy assignment tool, estimated as the total memory usage (A) and CPU time (B) required for the processing and classification of ∼3 million sequences against the SILVA 128 database. Error bars denote standard deviation across the three biomes tested (human gut, ocean, and soil).
Figure 4:
DS calculated for each genus included in the simulated datasets. Lower (brighter) values indicate a closer prediction to the true composition of the original sample. The black outline indicates the overall best scoring analysis pipeline for each environment. Taxa are ordered by decreasing abundance from left to right based on their composition in the simulated sample.
Figure 5:
PCoA between all samples analysed in relation to the true, expected dataset, using the Bray-Curtis distance method.
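
The Bray-Curtis distance named in the Figure 5 caption can be sketched as follows. This is an illustration of the standard formula (sum of absolute abundance differences divided by the sum of all abundances), not the authors' pipeline; the function name, taxon labels, and abundance values are hypothetical, and the PCoA ordination step itself is not shown.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two {taxon: abundance} mappings.

    Returns 0.0 for identical compositions and 1.0 when no taxa are shared.
    """
    taxa = set(sample_a) | set(sample_b)
    # Taxa absent from a sample are treated as having zero abundance.
    diff = sum(abs(sample_a.get(t, 0.0) - sample_b.get(t, 0.0)) for t in taxa)
    total = sum(sample_a.get(t, 0.0) + sample_b.get(t, 0.0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return diff / total


# Hypothetical relative abundances: the simulated truth vs. one tool's output.
expected = {"GenusA": 0.5, "GenusB": 0.3, "GenusC": 0.2}
observed = {"GenusA": 0.6, "GenusB": 0.3, "GenusD": 0.1}
print(round(bray_curtis(expected, observed), 3))
```

A lower value means the observed profile is closer to the expected composition, which is how proximity to the true dataset would be read in an ordination built on this distance.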
