Indel seeds for homology search

Denise Mak¹, Yevgeniy Gelfand, Gary Benson

PMID: 16873491
DOI: 10.1093/bioinformatics/btl263

Denise Mak et al. Bioinformatics. 2006 Jul 15;22(14):e341-9. doi: 10.1093/bioinformatics/btl263.

Affiliation

¹ Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA. dyfmak@bu.edu

Abstract

We are interested in detecting homologous genomic DNA sequences with the goal of locating approximate inverted, interspersed, and tandem repeats. Standard search techniques start by detecting small matching parts, called seeds, between a query sequence and database sequences. Contiguous seed models have existed for many years. Recently, spaced seeds were shown to be more sensitive than contiguous seeds without increasing the random hit rate. To determine the superiority of one seed model over another, a model of homologous sequence alignment must be chosen. Previous studies evaluating spaced and contiguous seeds have assumed that matches and mismatches occur within these alignments, but not insertions and deletions (indels). This is perhaps appropriate when searching for protein coding sequences (<5% of the human genome), but is inappropriate when looking for repeats in the majority of genomic sequence where indels are common. In this paper, we assume a model of homologous sequence alignment which includes indels and we describe a new seed model, called indel seeds, which explicitly allows indels. We present a waiting time formula for computing the sensitivity of an indel seed and show that indel seeds significantly outperform contiguous and spaced seeds when homologies include indels. We discuss the practical aspect of using indel seeds and finally we present results from a search for inverted repeats in the dog genome using both indel and spaced seeds.
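The contiguous and spaced seed models mentioned in the abstract can be illustrated with a minimal sketch. It assumes an indel-free alignment encoded as a string of '1' (match) and '0' (mismatch), and a seed encoded as '1' (match required) and '*' (position ignored). The function name `seed_hits`, this encoding, and the example strings are illustrative assumptions, not taken from the paper. The abstract does not define the notation for indel seeds, so they are not modeled here.

```python
from typing import List


def seed_hits(alignment: str, seed: str) -> List[int]:
    """Return the start positions at which `seed` hits `alignment`.

    alignment: string over '1' (match) and '0' (mismatch); no indels (assumed encoding).
    seed: string over '1' (match required) and '*' (don't care).
    """
    span = len(seed)
    hits = []
    for start in range(len(alignment) - span + 1):
        window = alignment[start:start + span]
        # A hit requires a match at every position the seed marks with '1'.
        if all(a == "1" for a, s in zip(window, seed) if s == "1"):
            hits.append(start)
    return hits


if __name__ == "__main__":
    alignment = "1101111011"   # hypothetical alignment with two mismatches
    contiguous = "1111"        # weight 4, span 4
    spaced = "11*11"           # weight 4, span 5
    print("contiguous:", seed_hits(alignment, contiguous))
    print("spaced:    ", seed_hits(alignment, spaced))
```

Both seeds require four matching positions, but the spaced seed tolerates a mismatch under its '*' position, so it can hit windows that the contiguous seed misses.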
