Centromere reference models for human chromosomes X and Y satellite arrays
- PMID: 24501022
- PMCID: PMC3975068
- DOI: 10.1101/gr.159624.113
Abstract
The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes.
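As an illustration of what modeling an array from monomer ordering information could look like, the sketch below counts which monomer directly follows which in a set of reads, then generates a linear array by a weighted random walk over those observed adjacencies. This is a simplified first-order Markov chain chosen for illustration only; the abstract does not specify the authors' algorithm. The function names `transition_counts` and `sample_array`, the monomer labels `m1` to `m4`, and the toy reads are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def transition_counts(reads):
    """Count how often monomer b directly follows monomer a across all reads."""
    counts = defaultdict(Counter)
    for read in reads:
        for a, b in zip(read, read[1:]):
            counts[a][b] += 1
    return counts


def sample_array(counts, start, length, seed=0):
    """Build a linear array by a random walk weighted by observed adjacency counts."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    array = [start]
    while len(array) < length:
        successors = counts.get(array[-1])
        if not successors:
            break  # no observed successor for this monomer: stop early
        monomers = list(successors)
        weights = [successors[m] for m in monomers]
        array.append(rng.choices(monomers, weights=weights, k=1)[0])
    return array


# Toy input (hypothetical): each read is an ordered list of monomer variant labels.
reads = [
    ["m1", "m2", "m3"],
    ["m2", "m3", "m1"],
    ["m3", "m1", "m2"],
    ["m2", "m3", "m4"],
    ["m4", "m1", "m2"],
]

counts = transition_counts(reads)
model = sample_array(counts, start="m1", length=12)
print(model)
```

Every adjacent pair in the generated array was observed in at least one read, so the output respects local monomer order while making no claim about true long-range structure. A real analysis would also need to identify monomers in raw reads and account for sequencing error and coverage, which this sketch omits.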
Similar articles
- Linear assembly of a human centromere on the Y chromosome. Nat Biotechnol. 2018 Apr;36(4):321-323. doi: 10.1038/nbt.4109. Epub 2018 Mar 19. PMID: 29553574. Free PMC article.
- Genomic characterization of large heterochromatic gaps in the human genome assembly. PLoS Comput Biol. 2014 May 15;10(5):e1003628. doi: 10.1371/journal.pcbi.1003628. eCollection 2014 May. PMID: 24831296. Free PMC article.
- Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing. Bioinformatics. 2016 Jul 1;32(13):1921-1924. doi: 10.1093/bioinformatics/btw101. Epub 2016 Feb 24. PMID: 27153570. Free PMC article.
- Satellite DNAs between selfishness and functionality: structure, genomics and evolution of tandem repeats in centromeric (hetero)chromatin. Gene. 2008 Feb 15;409(1-2):72-82. doi: 10.1016/j.gene.2007.11.013. Epub 2007 Dec 4. PMID: 18182173. Review.
- Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population. Genes (Basel). 2019 May 8;10(5):352. doi: 10.3390/genes10050352. PMID: 31072070. Free PMC article. Review.
Cited by
- Centromeres under Pressure: Evolutionary Innovation in Conflict with Conserved Function. Genes (Basel). 2020 Aug 10;11(8):912. doi: 10.3390/genes11080912. PMID: 32784998. Free PMC article. Review.
- LINE-related component of mouse heterochromatin and complex chromocenters' composition. Chromosome Res. 2016 Sep;24(3):309-23. doi: 10.1007/s10577-016-9525-9. Epub 2016 Apr 26. PMID: 27116673.
- The Genomic Landscape of Centromeres in Cancers. Sci Rep. 2019 Aug 2;9(1):11259. doi: 10.1038/s41598-019-47757-6. PMID: 31375789. Free PMC article.
- Haplotypes spanning centromeric regions reveal persistence of large blocks of archaic DNA. Elife. 2019 Jun 25;8:e42989. doi: 10.7554/eLife.42989. PMID: 31237235. Free PMC article.
- Functional Significance of Satellite DNAs: Insights From Drosophila. Front Cell Dev Biol. 2020 May 5;8:312. doi: 10.3389/fcell.2020.00312. eCollection 2020. PMID: 32432114. Free PMC article. Review.