RBR: library-less repeat detection for ESTs
- PMID: 16837527
- DOI: 10.1093/bioinformatics/btl368
Abstract
Motivation: Repeat sequences in ESTs are a source of problems, in particular for clustering. ESTs are therefore commonly masked against a library of known repeats. High quality repeat libraries are available for the widely studied organisms, but for most other organisms the lack of such libraries is likely to compromise the quality of EST analysis.
Results: We present a fast, flexible and library-less method for masking repeats in EST sequences, based on match statistics within the EST collection. The method is not linked to a particular clustering algorithm. Extensive testing on datasets using different clustering methods and a genomic mapping as reference shows that this method gives results that are better than or as good as those obtained using RepeatMasker with a repeat library.
Availability: The implementation of RBR is available under the terms of the GPL from http://www.ii.uib.no/~ketil/bioinformatics
Contact: ketil.malde@bccs.uib.no
Supplementary information: Supplementary data are available at Bioinformatics online.
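Illustration: the abstract describes masking based on match statistics within the EST collection but does not give the algorithm. The sketch below is not RBR's method. It is a simplified stand-in, assuming that a k-mer seen in unusually many ESTs is a repeat candidate and should be soft-masked (lowercased). The function names and the parameters k and max_count are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(seqs, k):
    """Count, for each k-mer, the number of sequences that contain it."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        # Use a set so each k-mer is counted once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def mask_repeats(seqs, k=5, max_count=2):
    """Soft-mask (lowercase) every base covered by a k-mer that
    occurs in more than max_count sequences of the collection.

    Assumption: a plain frequency cutoff stands in for the
    match statistics used by the real tool.
    """
    counts = count_kmers(seqs, k)
    masked = []
    for s in seqs:
        up = s.upper()
        flags = [False] * len(up)
        for i in range(len(up) - k + 1):
            if counts[up[i:i + k]] > max_count:
                # Flag all k positions covered by this frequent k-mer.
                for j in range(i, i + k):
                    flags[j] = True
        masked.append(
            "".join(c.lower() if f else c for c, f in zip(up, flags))
        )
    return masked


# Toy collection: three reads share the element GATTACA, one does not.
ests = [
    "CCCCCGATTACA",
    "GGGGGGATTACA",
    "AAAAAGATTACA",
    "TGCATGCATGCA",
]
for read in mask_repeats(ests, k=5, max_count=2):
    print(read)
```

Because the counts come from the EST collection itself, no repeat library is needed, and the masked output can be passed to any clustering tool that honors soft-masking. A real collection would need a much larger k and a cutoff that accounts for highly expressed genes, which also produce frequent k-mers.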