A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy
- PMID: 28486927
- PMCID: PMC5424349
- DOI: 10.1186/s12859-017-1670-4
Abstract
Background: Species-level classification of 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification or produce unreliable results. The results are unreliable because existing methods either lack sound probabilistic criteria for evaluating the confidence of their taxonomic assignments or use nucleotide k-mer frequency as a proxy for sequence similarity.
Results: We have developed a method that achieves significantly better species-level classification than existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species level to the phylum level based on the lowest common ancestors of multiple database hits for each query sequence, and the reliability of each classification is then evaluated with bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based on the degree of sequence similarity between the database hit and the query sequence. Our method does not need training datasets specific to different taxonomic groups. Instead, it requires only a reference database against which query sequences are aligned, which makes it easily applicable to different regions of the 16S rRNA gene or to other phylogenetic marker genes.
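The weighting and bootstrap steps described above can be illustrated with a minimal Python sketch. It is not the BLCA implementation. The function names (`identity`, `posterior_weights`, `classify`), the `sharpness` parameter, the uniform prior, the exponential likelihood, and the shortened rank list are all assumptions made for illustration, and the hits are assumed to be already aligned to the query at equal length. Each bootstrap replicate resamples alignment columns, recomputes the similarity of every hit, converts the similarities to posterior weights, and adds those weights to the taxon of each hit at every rank.

```python
import math
import random
from collections import defaultdict

RANKS = ("phylum", "genus", "species")  # shortened rank list, for illustration only


def identity(query, hit, columns):
    """Fraction of the sampled alignment columns where query and hit agree."""
    return sum(query[i] == hit[i] for i in columns) / len(columns)


def posterior_weights(similarities, sharpness=50.0):
    """Posterior probability of each hit under a uniform prior.

    ASSUMPTION: the likelihood is taken as exp(sharpness * similarity);
    this functional form is hypothetical, not taken from the paper.
    """
    best = max(similarities)  # subtract the maximum for numerical stability
    likelihoods = [math.exp(sharpness * (s - best)) for s in similarities]
    total = sum(likelihoods)
    return [like / total for like in likelihoods]


def classify(query, hits, n_boot=100, seed=0):
    """Return {rank: (taxon, confidence)} for one aligned query.

    hits: list of (lineage dict, aligned hit sequence); every aligned
    sequence must have the same length as the aligned query.
    """
    rng = random.Random(seed)
    n_cols = len(query)
    support = {rank: defaultdict(float) for rank in RANKS}
    for _ in range(n_boot):
        # Bootstrap: resample alignment columns with replacement.
        columns = [rng.randrange(n_cols) for _ in range(n_cols)]
        sims = [identity(query, seq, columns) for _, seq in hits]
        for (lineage, _), weight in zip(hits, posterior_weights(sims)):
            for rank in RANKS:
                support[rank][lineage[rank]] += weight / n_boot
    # Report the best-supported taxon at each rank with its averaged weight.
    return {rank: max(support[rank].items(), key=lambda kv: kv[1]) for rank in RANKS}


if __name__ == "__main__":
    query = "ACGTACGTAC"
    hits = [
        ({"phylum": "P1", "genus": "G1", "species": "X"}, "ACGTACGTAC"),
        ({"phylum": "P1", "genus": "G1", "species": "Y"}, "ACGTACGTAT"),
        ({"phylum": "P1", "genus": "G2", "species": "Z"}, "TTTTTCGTAC"),
    ]
    for rank, (taxon, confidence) in classify(query, hits).items():
        print(rank, taxon, round(confidence, 3))
```

In this toy setup, two near-identical hits from the same genus split the species-level support while agreeing at the genus level, so confidence tends to rise toward higher ranks. That is the behaviour one would expect from an approach based on lowest common ancestors.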
Conclusions: Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than existing tools, and it provides probability-based confidence scores for evaluating the reliability of its taxonomic assignments based on multiple database matches to each query sequence. Despite its higher computational cost, our method remains practical for analyzing large-scale microbiome datasets. Furthermore, our method can be applied to the taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA.
Keywords: 16S rRNA gene; Taxonomic classification.