Spectral Repeat Finder (SRF): identification of repetitive sequences using Fourier transformation
- PMID: 14976032
- DOI: 10.1093/bioinformatics/bth103
Abstract
Motivation: Repetitive DNA sequences, besides having a variety of regulatory functions, are one of the principal causes of genomic instability. Understanding their origin and evolution is of fundamental importance for genome studies. The identification of repeats and their units helps in deducing the intra-genomic dynamics as an important feature of comparative genomics. A major difficulty in identification of repeats arises from the fact that the repeat units can be either exact or imperfect, in tandem or dispersed, and of unspecified length.
Results: The Spectral Repeat Finder program circumvents these problems by using a discrete Fourier transformation to identify significant periodicities present in a sequence. The specific regions of the sequence that contribute to a given periodicity are located through a sliding window analysis, and an exact search method is then used to find the repetitive units. The program provides efficient and complete detection of repeats, together with interactive and detailed visualization of the spectral analysis of the input sequence. We demonstrate the utility of our method with various examples that contain previously unannotated repeats. A Web server has been developed for convenient access to the automated program.
Availability: The Web server is available at http://www.imtech.res.in/raghava/srf and http://www2.imtech.res.in/raghava/srf
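The first two steps described under Results (a Fourier spectrum to find a periodicity, then a sliding window to locate the region that contributes to it) can be illustrated with a minimal sketch. It is not the SRF implementation: the abstract does not say how the sequence is encoded numerically, so the sketch assumes the common approach of summing the power of four binary base-indicator sequences. The function names `power_at`, `dominant_period` and `window_powers`, the window size and the step are all hypothetical, and the final exact search for repeat units is not shown.

```python
import cmath

BASES = "ACGT"  # assumption: plain DNA alphabet, other symbols are ignored


def power_at(seq, freq):
    """Summed Fourier power of the four base-indicator sequences.

    `freq` is in cycles per base, so a repeat of period p peaks at freq = 1/p.
    """
    total = 0.0
    for base in BASES:
        # Fourier coefficient of the 0/1 indicator sequence for this base.
        coeff = sum(cmath.exp(-2j * cmath.pi * freq * t)
                    for t, c in enumerate(seq) if c == base)
        total += abs(coeff) ** 2
    return total


def dominant_period(seq):
    """Period (in bases) of the strongest peak among the DFT frequencies k/N."""
    n = len(seq)
    # Skip k = 0, which only reflects base composition.
    k_best = max(range(1, n // 2 + 1), key=lambda k: power_at(seq, k / n))
    return n / k_best


def window_powers(seq, period, window, step):
    """Power at the given period in each sliding window, as (start, power)."""
    return [(start, power_at(seq[start:start + window], 1.0 / period))
            for start in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    # Toy input: 20 non-repetitive bases followed by 8 copies of ACGTT.
    toy = "ACGGTCATGCTAAGCTTGCA" + "ACGTT" * 8
    print(dominant_period("ACGTT" * 12))
    for start, power in window_powers(toy, period=5.0, window=20, step=5):
        print(start, round(power, 1))
```

On the toy input, windows that lie entirely inside the tandem repeat carry much more power at period 5 than windows in the non-repetitive prefix, which is the signal a sliding window analysis uses to localize a repeat. The direct summation costs O(N^2) over a full spectrum; a practical tool would use a fast Fourier transform instead.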