This is a preprint.
De novo reconstruction of satellite repeat units from sequence data
- PMID: 37131874
- PMCID: PMC10153287
Update in
- De novo reconstruction of satellite repeat units from sequence data. Genome Res. 2023 Dec 1;33(11):1994-2001. doi: 10.1101/gr.278005.123. PMID: 37918962.
Abstract
Satellite DNAs are long, tandemly repeated sequences in a genome and may be organized as high-order repeats (HORs). They are enriched in centromeres and are challenging to assemble. Existing algorithms for identifying satellite repeats either require the complete assembly of satellites or only work for simple repeat structures without HORs. Here we describe Satellite Repeat Finder (SRF), a new algorithm for reconstructing satellite repeat units and HORs from accurate reads or assemblies without prior knowledge of repeat structures. Applying SRF to real sequence data, we showed that SRF could reconstruct known satellites in human and well-studied model organisms. We also found that satellite repeats are pervasive in various other species, accounting for up to 12% of their genome contents, but they are often underrepresented in assemblies. With the rapid progress in genome sequencing, SRF will help the annotation of new genomes and the study of satellite DNA evolution even if such repeats are not fully assembled.
Conflict of interest statement
H.L. is a consultant for Integrated DNA Technologies, Inc.