A poor man's BLASTX--high-throughput metagenomic protein database search using PAUDA
- PMID: 23658416
- PMCID: PMC3866550
- DOI: 10.1093/bioinformatics/btt254
Abstract
Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ~10,000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil, for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800,000 CPU hours. The PAUDA analysis leads to the same clustering of samples by functional profiles.
Availability: PAUDA is freely available from http://ab.inf.uni-tuebingen.de/software/pauda. Supplementary method details are also available from this website.
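The speed-up quoted in the summary can be checked against the CPU-hour figures it reports. The sketch below is only a back-of-the-envelope calculation, not part of PAUDA: the constant and function names are hypothetical, the PAUDA cost is taken at its stated upper bound of 80 CPU hours, and the per-read comparison assumes that cost scales linearly with the number of reads.

```python
# Figures quoted in the summary. PAUDA's cost is reported as "<80" CPU hours,
# so 80 is an upper bound and the resulting ratios are lower bounds.
BLASTX_CPU_HOURS = 800_000
BLASTX_READS = 176_000_000   # BLASTX was run on a subset of the reads
PAUDA_CPU_HOURS = 80
PAUDA_READS = 246_000_000


def cpu_hours_per_million_reads(cpu_hours, reads):
    """Normalize a total CPU cost to CPU hours per one million reads."""
    return cpu_hours / (reads / 1_000_000)


def speedup_estimates():
    """Return (raw ratio of CPU hours, ratio normalized per read).

    The normalized ratio assumes cost grows linearly with read count,
    which is an assumption made here, not a claim from the abstract.
    """
    raw = BLASTX_CPU_HOURS / PAUDA_CPU_HOURS
    blastx_rate = cpu_hours_per_million_reads(BLASTX_CPU_HOURS, BLASTX_READS)
    pauda_rate = cpu_hours_per_million_reads(PAUDA_CPU_HOURS, PAUDA_READS)
    return raw, blastx_rate / pauda_rate


if __name__ == "__main__":
    raw, per_read = speedup_estimates()
    print(f"Raw CPU-hour ratio: {raw:,.0f}x")
    print(f"Per-read ratio (linear-scaling assumption): {per_read:,.0f}x")
```

The raw ratio matches the ~10,000-fold figure in the summary. Because the BLASTX run covered fewer reads, the per-read ratio comes out somewhat higher under the linear-scaling assumption.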