The closest BLAST hit is often not the nearest neighbor
- PMID: 11443357
- DOI: 10.1007/s002390010184
Abstract
It is well known that basing phylogenetic reconstructions on uncorrected genetic distances can lead to errors. Nevertheless, it is common practice to report simply the most similar BLAST (Altschul et al. 1997) hit in genomic reports that discuss many genes (Ruepp et al. 2000; Freiberg et al. 1997). This is because BLAST hits can provide a rapid, efficient, and concise analysis of many genes at once. These hits are often interpreted to imply that the gene is most closely related to the gene or protein in the databases that returned the closest BLAST hit. Although the closest hit and the nearest phylogenetic neighbor may coincide, for many genes, particularly genes with few homologs, they are not the same. A number of circumstances can account for such limitations in accuracy (Eisen 2000). We stress here that genes appearing to be the most similar based on BLAST hits are often not each other's closest relatives phylogenetically. The extent to which this occurs depends on the availability of close relatives in the databases. As examples we have chosen the analysis of the genomes of the crenarchaeote Aeropyrum pernix, an organism with few fully sequenced close relatives, and Escherichia coli, an organism whose closest relative, Salmonella typhimurium, is completely sequenced.
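The distinction between "most similar" and "most closely related" can be illustrated with a toy calculation. The sketch below is not the authors' analysis: the tree, the taxon names (`query`, `fast_sister`, `slow_cousin`), and the branch lengths are all invented, and path length along the tree is used only as a rough stand-in for how dissimilar a BLAST hit would look. Under these assumptions, a fast-evolving sister lineage ends up farther from the query than a slowly evolving, more distant relative.

```python
# Toy rooted tree stored as child -> (parent, branch_length).
# All names and branch lengths are hypothetical, chosen for illustration.
TREE = {
    "query": ("n1", 0.10),
    "fast_sister": ("n1", 0.90),    # true sister of the query, fast-evolving
    "n1": ("root", 0.10),
    "slow_cousin": ("root", 0.15),  # more distant relative, slow-evolving
}


def path_to_root(node):
    """Map each ancestor of `node` (and `node` itself) to its distance from `node`."""
    dist = 0.0
    path = {node: 0.0}
    while node in TREE:
        parent, length = TREE[node]
        dist += length
        path[parent] = dist
        node = parent
    return path


def patristic_distance(a, b):
    """Sum of branch lengths on the tree path between leaves a and b."""
    pa, pb = path_to_root(a), path_to_root(b)
    # The last common ancestor gives the smallest summed distance.
    return min(pa[n] + pb[n] for n in pa if n in pb)


def leaves():
    """Nodes that never appear as a parent."""
    parents = {parent for parent, _ in TREE.values()}
    return [n for n in TREE if n not in parents]


def most_similar(query):
    """Leaf with the smallest distance to the query (the 'top hit' analogue)."""
    others = [leaf for leaf in leaves() if leaf != query]
    return min(others, key=lambda leaf: patristic_distance(query, leaf))


def sister_leaves(query):
    """Leaves descending from the query's parent node (its nearest neighbors)."""
    parent = TREE[query][0]
    return sorted(
        leaf for leaf in leaves()
        if leaf != query and parent in path_to_root(leaf)
    )


if __name__ == "__main__":
    print("most similar:", most_similar("query"))
    print("nearest neighbor(s):", sister_leaves("query"))
```

In this toy case the smallest distance points to `slow_cousin`, while the tree places `fast_sister` next to the query. Real BLAST scores come from alignments rather than tree path lengths, so this shows only the rate-variation effect, one of several circumstances that can separate the top hit from the nearest neighbor.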