Satellog: a database for the identification and prioritization of satellite repeats in disease association studies
- PMID: 15949044
- PMCID: PMC1181805
- DOI: 10.1186/1471-2105-6-145
Abstract
Background: To date, 35 human diseases, some of which also exhibit anticipation, have been associated with unstable repeats. Anticipation has been reported in a number of diseases in which repeat expansion may have a role in etiology. Despite the growing importance of unstable repeats in disease, no resource currently exists for the prioritization of repeats. Here we present Satellog, a database that catalogs all pure satellite repeats in the human genome with repeat units of length 1-16, along with supplementary data. Satellog analyzes each pure repeat in UniGene clusters for evidence of repeat polymorphism.
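As a rough illustration of what cataloging pure repeats with units of length 1-16 can involve, the sketch below scans a DNA string with a backreference regular expression. It is not Satellog's actual pipeline: the function name `find_pure_repeats` and the `min_copies=3` threshold are assumptions made for this example only.

```python
import re

def find_pure_repeats(seq, min_unit=1, max_unit=16, min_copies=3):
    """Return sorted (start, unit, copies) tuples for pure tandem repeats.

    Hypothetical helper: min_copies is an illustrative threshold,
    not a value taken from the Satellog paper.
    """
    seq = seq.upper()
    hits = []
    for period in range(min_unit, max_unit + 1):
        # A unit of `period` bases followed by at least min_copies - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (period, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are just a shorter unit repeated (e.g. "AA" for "A").
            if any(period % p == 0 and unit == unit[:p] * (period // p)
                   for p in range(1, period)):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // period))
    return sorted(hits)

if __name__ == "__main__":
    print(find_pure_repeats("TTCAGCAGCAGCAGTT"))
```

Because the match requires exact copies of the unit, interrupted (impure) repeats are not reported as a single tract, which matches the "pure" restriction described above.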
Results: A total of 5,546 such repeats were identified, providing the first indication of many novel polymorphic sites in the genome. Overall, polymorphic repeats were over-represented within 3'-UTR sequence relative to 5'-UTR and coding sequence. Interestingly, we observed that repeat polymorphism within coding sequence is restricted to trinucleotide repeats whereas UTR sequence tolerated a wider range of repeat period polymorphisms. For each pure repeat we also calculate its repeat length percentile rank, its location either within or adjacent to EnsEMBL genes, and its expression profile in normal tissues according to the GeneNote database.
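The repeat length percentile rank mentioned above can be pictured with a small sketch. The abstract does not give the exact formula, so this example assumes one common definition: the percentage of repeats in the same class whose length is less than or equal to the repeat in question. The name `percentile_rank` and the sample lengths are hypothetical.

```python
from bisect import bisect_right

def percentile_rank(length, class_lengths):
    """Percentage of repeats in the class with length <= the given length.

    Assumed definition for illustration; the paper may define the rank differently.
    """
    ordered = sorted(class_lengths)
    return 100.0 * bisect_right(ordered, length) / len(ordered)

if __name__ == "__main__":
    # Toy lengths (in repeat copies) for one hypothetical repeat class.
    lengths = [4, 6, 8, 10, 30]
    print(percentile_rank(10, lengths))
```

Under this definition, an unusually long repeat for its class scores close to 100, which is the property a prioritization step would look for.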
Conclusion: Satellog provides the ability to dynamically prioritize repeats based on any of their characteristics (i.e. repeat unit, class, period, length, repeat length percentile rank, genomic co-ordinates), polymorphism profile within UniGene, proximity to or presence within gene regions (e.g. CDS, UTR, 15 kb upstream), metadata of the genes they are detected within and gene expression profiles within normal human tissues. Unstable repeats associated with 31 diseases were analyzed in Satellog to evaluate their common repeat properties. The utility of Satellog was highlighted by prioritizing repeats for Huntington's disease and schizophrenia. Satellog is available online at http://satellog.bcgsc.ca.
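The kind of dynamic prioritization described in the conclusion can be sketched as a filter followed by a sort over repeat records. The `Repeat` fields and the `prioritize` function below are hypothetical and do not reflect Satellog's real schema or query interface; the ranking rule (polymorphic repeats first, then highest length percentile) is likewise an assumption.

```python
from dataclasses import dataclass

@dataclass
class Repeat:
    # Hypothetical record layout, not Satellog's actual schema.
    unit: str
    length_percentile: float
    region: str          # e.g. "CDS", "5'-UTR", "3'-UTR"
    polymorphic: bool    # evidence of polymorphism in expressed sequence

def prioritize(repeats, unit=None, regions=None, min_percentile=0.0):
    """Filter repeats by the given criteria, then rank the survivors."""
    selected = [
        r for r in repeats
        if (unit is None or r.unit == unit)
        and (regions is None or r.region in regions)
        and r.length_percentile >= min_percentile
    ]
    # Assumed ranking: polymorphic repeats first, then longest for their class.
    return sorted(selected, key=lambda r: (not r.polymorphic, -r.length_percentile))

if __name__ == "__main__":
    toy = [
        Repeat("CAG", 99.0, "CDS", False),
        Repeat("CAG", 95.0, "CDS", True),
        Repeat("CA", 99.5, "3'-UTR", True),
        Repeat("CAG", 40.0, "CDS", True),
    ]
    for r in prioritize(toy, unit="CAG", regions={"CDS"}, min_percentile=90.0):
        print(r)
```

A search for candidate CAG repeats in coding sequence, as one might run for a disease such as Huntington's, would correspond to the `unit="CAG", regions={"CDS"}` call shown in the toy usage.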
Similar articles
- 3'-UTR SIRF: a database for identifying clusters of short interspersed repeats in 3' untranslated regions. BMC Bioinformatics. 2007 Jul 30;8:274. doi: 10.1186/1471-2105-8-274. PMID: 17663765. Free PMC article.
- FREP: a database of functional repeats in mouse cDNAs. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D471-5. doi: 10.1093/nar/gkh123. PMID: 14681460. Free PMC article.
- Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a web-based resource. BMC Bioinformatics. 2004 Jan 12;5:4. doi: 10.1186/1471-2105-5-4. PMID: 14715089. Free PMC article.
- Microsatellite and trinucleotide-repeat evolution: evidence for mutational bias and different rates of evolution in different lineages. Philos Trans R Soc Lond B Biol Sci. 1999 Jun 29;354(1386):1095-9. doi: 10.1098/rstb.1999.0465. PMID: 10434312. Free PMC article. Review.
- Features of trinucleotide repeat instability in vivo. Cell Res. 2008 Jan;18(1):198-213. doi: 10.1038/cr.2008.5. PMID: 18166978. Review.
Cited by
- A novel trinucleotide repeat expansion at chromosome 3q26.2 identified by a CAG/CTG repeat expansion detection array. Hum Genet. 2006 Sep;120(2):193-200. doi: 10.1007/s00439-006-0207-0. Epub 2006 Jun 17. PMID: 16783570.
- An online conserved SSR discovery through cross-species comparison. Adv Appl Bioinform Chem. 2009;2:23-35. doi: 10.2147/aabc.s4744. Epub 2009 Feb 8. PMID: 21918613. Free PMC article.
- RiDs db: Repeats in diseases database. Bioinformation. 2011;7(2):96-7. doi: 10.6026/97320630007096. Epub 2011 Sep 6. PMID: 21938212. Free PMC article.
- Exploring the relationship between polymorphic (TG/CA)n repeats in intron 1 regions and gene expression. Hum Genomics. 2009 Apr;3(3):236-45. doi: 10.1186/1479-7364-3-3-236. PMID: 19403458. Free PMC article.
- A microsatellite repeat in PCA3 long non-coding RNA is associated with prostate cancer risk and aggressiveness. Sci Rep. 2017 Dec 4;7(1):16862. doi: 10.1038/s41598-017-16700-y. PMID: 29203868. Free PMC article.