NBC update: The addition of viral and fungal databases to the Naïve Bayes classification tool
- PMID: 22293603
- PMCID: PMC3284397
- DOI: 10.1186/1756-0500-5-81
Abstract
Background: Classifying the fungal and viral content of a sample is an important component of analyzing microbial communities in environmental media. A method that can classify any fragment of these organisms' DNA is therefore needed.
Results: We update the naïve Bayes classification (NBC) tool to classify reads originating from viral and fungal organisms. NBC classifies a fungal dataset similarly to the Basic Local Alignment Search Tool (BLAST) and the Ribosomal Database Project (RDP) classifier. We also show NBC's similarities to and differences from RDP on a fungal large subunit (LSU) ribosomal DNA dataset. For viruses in the training database, strain-level classification accuracy is 98%. For reads originating from sequences not in the database, order-level accuracy is 78%, where order refers to the taxonomic rank in the tree of life.
Conclusions: In addition to being competitive with other available classifiers, NBC has the potential to handle reads originating from any location in the genome. We recommend using the Bacteria/Archaea, Fungal, and Virus databases separately due to algorithmic biases towards long genomes. The tool is publicly available at: http://nbc.ece.drexel.edu.
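The abstract does not describe the scoring procedure, so the following is only a minimal sketch of how a naïve Bayes read classifier can work in general. It assumes that each genome is summarized by its overlapping k-mer counts and that a read is assigned to the genome with the highest smoothed log-likelihood. The function names, the toy sequences, the word length k = 3, and the add-one smoothing are all illustrative assumptions, not details of the NBC tool.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count the overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train(genomes, k):
    """Map each genome name to (k-mer counts, total number of k-mers)."""
    model = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        model[name] = (counts, sum(counts.values()))
    return model


def log_score(read, counts, total, k):
    """Log-likelihood of a read under one genome's k-mer profile.

    Treats the read's k-mers as independent (the "naive" assumption) and
    applies add-one smoothing over the 4**k possible DNA k-mers
    (an assumption made for this sketch).
    """
    vocab = 4 ** k
    score = 0.0
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        score += math.log((counts[kmer] + 1) / (total + vocab))
    return score


def classify(read, model, k):
    """Return the genome name with the highest log-likelihood for the read."""
    return max(model, key=lambda name: log_score(read, *model[name], k))


# Toy training data (hypothetical sequences, not real genomes).
K = 3
GENOMES = {
    "virus_A": "ATATATATATATATATATAT",
    "fungus_B": "GCGCGCGCGCGCGCGCGCGC",
}
MODEL = train(GENOMES, K)

print(classify("ATATAT", MODEL, K))
print(classify("GCGCGC", MODEL, K))
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long reads, and the smoothing keeps a single unseen k-mer from driving a genome's score to negative infinity.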