Review
Annu Rev Genet. 2021 Nov 23;55:583-602.
doi: 10.1146/annurev-genet-071719-020519.

Variation and Evolution of Human Centromeres: A Field Guide and Perspective

Karen H Miga et al. Annu Rev Genet.

Abstract

We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.

Keywords: centromere; epigenetics; genome; repeat; satellite DNA; variation.


Figures

Figure 1
Structure and evolution of alpha satellite arrays. (a) Illustration of the general genomic organization of a human centromeric region, which includes one homogeneous core made of chromosome-specific HORs (red) and the imperfect symmetrical organization of smaller arrays of various other homogeneous HORs [pseudocentromeres or inactive HOR arrays (light gray)], divergent HORs [recent relic centromeres (dark gray)], and multiple distinct divergent monomeric arrays (older relic centromeres, with blocks indicating colors describing phylogenetic assignments listed in Supplemental Table 1). These regions typically include other pericentromeric satellite classes [e.g., HSat1–HSat3 (teal)] and SDs. The entire centromeric region is defined by those sequences in the cenhap (48), presented as gray flanking regions extending into the p-arm and q-arm. Arrayed triangles indicate alpha satellite monomers and HORs of various length and structures composed of several different monomers. (b) Centromere X array haplotype maps, as determined from DXZ1 (S3CXH1L) HOR clustering and divergence data, provide evidence for block organization and gradient of divergence throughout all the layers. Classification of haplotypes is determined by phylogenetic relationships of the DXZ1 HOR repeats, revealing three distinct larger haplotypes (gray, yellow, and light purple). The larger haplotype structure (three major branches on the phylogenetic tree of haplotype consensus HORs) can be further characterized into 14 DXZ1-HOR subgroupings representing individual haplotypes (6, 65). One subbranch (white) represented by one HOR is a hybrid between two other haplotypes. The numbers in parentheses indicate the number of HORs in each clade. The dot plot for the self-aligned DXZ1 array (lighter areas have higher homogeneity) and StV map with few variant HORs (white) are also shown. (c) Kinetochore selection model for satellite array evolution. 
This model (see Section 2.9) proposes that selfish selection operates on the array through the amplification of the repeat (light blue) due to the association with kinetochore (green) assembly, which locates itself on repeats to which it happens to have maximal affinity. Over time, the new satellite array (light blue) replaces the original satellite array (yellow), which shrinks progressively due to the ongoing deletion process. Centromeric arrays that are no longer associated with the kinetochore are considered dead and are arranged symmetrically, flanking the live arrays. Dead arrays are depicted as light gray (oldest region), dark gray (medium old), and adjacent yellow (newly inactivated dead alpha satellite array). Abbreviations: cenhap, centromere-spanning haplotype; HOR, higher-order repeat; HOR (L): live, or HOR array associated with kinetochore assembly; HSat, classical human satellites; SD, segmental duplication; StV, structural variant of a HOR. Figure adapted from data presented in Reference .
Figure 2
Epigenetic characterization of three complete centromeric arrays from T2T assemblies of chr1, chrX, and chr8. Access to complete and accurate assemblies of human centromeric regions provides a new opportunity to characterize all live alpha satellite HOR arrays [shown for D1Z7, chr1-SF1 (pink); DXZ1, chrX-SF3 (blue); and D8Z2, chr8-SF2 (purple)] and adjunct dead arrays. Further, these maps offer a high-resolution study of CENP-B-binding motifs (dark green represents repeats where the motif is in forward orientation and light green represents those with a motif in reverse orientation), and pJα-binding site sequences (light purple). Note that the regions enriched in reverse motifs indicate an inversion in centromere 1, the single unique event in all of the live centromeres. With the exception of centromere 8 (where CENP-B boxes and pJα are intermixed in the live array), live arrays within centromeric regions on chromosomes 1 and X contain CENP-B boxes, and flanking divergent monomeric regions contain pJα. The map of CpG methylation in ultralong Nanopore data obtained using long-read mapping protocols (previously described in 67) reveals dips in methylation that are coincident with sites of kinetochore assembly [illustrated with enrichment of CENP-A in native ChIP-seq data (52)]. Abbreviations: CENP-A, centromere protein A; CENP-B, centromere protein B; ChIP-seq, chromatin immunoprecipitation sequencing; chr, chromosome; HOR, higher-order repeat; SF, suprachromosomal family; T2T, telomere-to-telomere.

References

    1. Aldrup-MacDonald ME, Kuo ME, Sullivan LL, Chew K, Sullivan BA. 2016. Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles. Genome Res. 26(10):1301–11
    2. Alexandrov I, Kazakov A, Tumeneva I, Shepelev V, Yurov Y. 2001. Alpha-satellite DNA of primates: old and new families. Chromosoma 110(4):253–66
    3. Alexandrov IA, Medvedev LI, Mashkova TD, Kisselev LL, Romanova LY, Yurov YB. 1993. Definition of a new alpha satellite suprachromosomal family characterized by monomeric organization. Nucleic Acids Res. 21(9):2209–15
    4. Alexandrov IA, Mitkevich SP, Yurov YB. 1988. The phylogeny of human chromosome specific alpha satellites. Chromosoma 96(6):443–53
    5. Alkan C, Ventura M, Archidiacono N, Rocchi M, Sahinalp SC, Eichler EE. 2007. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data. PLOS Comput. Biol. 3(9):1807–18
