Algorithms for locating extremely conserved elements in multiple sequence alignments
- PMID: 20021665
- PMCID: PMC2808710
- DOI: 10.1186/1471-2105-10-432
Abstract
Background: In 2004, Bejerano et al. announced the startling discovery of hundreds of "ultraconserved elements", long genomic sequences perfectly conserved across human, mouse, and rat. Their announcement stimulated a flurry of subsequent research.
Results: We generalize the notion of ultraconserved element in a natural way from extraordinary human-rodent conservation to extraordinary conservation over an arbitrary set of species. We call these "Extremely Conserved Elements". We give a linear-time algorithm that finds all such Extremely Conserved Elements in any multiple sequence alignment, provided that the conservation is required to be across all the aligned species. For the general case of conservation across an arbitrary subset of the aligned species, we show that the question of whether there exists an Extremely Conserved Element is NP-complete. We illustrate the linear-time algorithm by cataloging all 177 Extremely Conserved Elements in the currently available 44-vertebrate whole-genome alignment, and point out some of the characteristics of these elements.
Conclusions: The NP-completeness in the case of conservation across an arbitrary subset of the aligned species implies that an efficient algorithm is unlikely to exist for this general case. Nevertheless, for the interesting case of conservation across all or most of the aligned species, our algorithm is efficient enough to be practical. The 177 Extremely Conserved Elements that we catalog demonstrate many of the characteristics of the original ultraconserved elements of Bejerano et al.
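To make the all-species case concrete, the following is a minimal Python sketch of a single left-to-right scan over alignment columns. It is not the authors' algorithm: the abstract does not give the exact definition of an Extremely Conserved Element, so the sketch assumes a simplified one, namely a maximal run of at least `min_len` consecutive columns in which every aligned row carries the same non-gap character. The function name `conserved_runs`, the parameter `min_len`, and the use of `-` as the gap symbol are all assumptions made for illustration.

```python
from typing import List, Tuple


def conserved_runs(alignment: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of maximal runs in which
    every row carries the same non-gap character.

    Assumed simplification: conservation is required across ALL rows,
    and '-' marks a gap. Each alignment cell is inspected once, so the
    running time is linear in rows * columns.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have the same length")

    runs: List[Tuple[int, int]] = []
    run_start = None  # first column of the current conserved run, if any
    for col in range(width):
        first = alignment[0][col]
        conserved = first != "-" and all(row[col] == first for row in alignment)
        if conserved:
            if run_start is None:
                run_start = col
        else:
            # A non-conserved column closes the current run.
            if run_start is not None and col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    # Close a run that extends to the last column.
    if run_start is not None and width - run_start >= min_len:
        runs.append((run_start, width))
    return runs


toy_alignment = [
    "ACGTACGT-A",
    "ACGTTCGT-A",
    "ACGTACGTGA",
]
print(conserved_runs(toy_alignment, 3))
```

On this toy alignment, columns 0 to 3 and 5 to 7 are perfectly conserved, while column 4 (a mismatch) and column 8 (gaps) break the runs. Requiring conservation across only a subset of rows is the case the abstract reports as NP-complete, and this sketch does not attempt it.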