[Preprint]. 2023 May 30:2023.05.30.542849.
doi: 10.1101/2023.05.30.542849.

The variation and evolution of complete human centromeres

Glennis A Logsdon et al. bioRxiv.

Update in

  • The variation and evolution of complete human centromeres.
    Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Catacchio CR, Porubsky D, Mao Y, Yoo D, Rautiainen M, Koren S, Nurk S, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Ventura M, Alexandrov IA, Eichler EE. Nature. 2024 May;629(8010):136-145. doi: 10.1038/s41586-024-07278-3. Epub 2024 Apr 3. PMID: 38570684.

Abstract

We completely sequenced and assembled all centromeres from a second human genome and used two reference sets to benchmark genetic, epigenetic, and evolutionary variation within centromeres from a diversity panel of humans and apes. We find that centromere single-nucleotide variation can increase by up to 4.1-fold relative to other genomic regions, with the caveat that up to 45.8% of centromeric sequence, on average, cannot be reliably aligned with current methods due to the emergence of new α-satellite higher-order repeat (HOR) structures and two- to threefold differences in the length of the centromeres. The extent to which this occurs differs depending on the chromosome and haplotype. Comparing the two sets of complete human centromeres, we find that eight harbor distinctly different α-satellite HOR array structures and four contain novel α-satellite HOR variants in high abundance. DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by at least 500 kbp, a property not readily associated with novel α-satellite HORs. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan, and macaque genomes. Comparative analyses reveal nearly complete turnover of α-satellite HORs, but with idiosyncratic changes in structure characteristic of each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the p- and q-arms of human chromosomes and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

Conflict of interest statement

Competing interests: SN is now an employee of Oxford Nanopore Technologies, Inc.; SK has received travel funds to speak at events hosted by Oxford Nanopore Technologies, Inc.; EEE is a scientific advisory board member of Variant Bio, Inc.

Figures

Figure 1. Overview of the centromeric genetic and epigenetic variation between two human genomes.
Complete assembly of centromeres from two hydatidiform moles, CHM1 and CHM13, reveals both small- and large-scale variation in centromere sequence, structure, and epigenetic landscape. The CHM1 and CHM13 centromeres are shown on the left and right between each pair of chromosomes, respectively. The length of the α-satellite higher-order repeat (HOR) array(s) is indicated, and the location of centromeric chromatin, marked by the presence of the histone H3 variant CENP-A, is indicated by a dark red circle.
Figure 2. Variation in sequence and structure between two sets of human centromeres.
a) Dot matrix plots showing allelic variation between CHM1 and CHM13 centromeric/pericentromeric haplotypes. Diagonal lines are colored by % sequence identity. The α-satellite HOR structure is shown on the axes, along with the organization of each centromeric/pericentromeric region. b) Comparison of the % sequence identity and # of Mbp aligned for 112 human centromere haplotypes from the HPRC and HGSVC mapped to the complete CHM1 and CHM13 centromere assemblies. Note that each dot represents a haplotype with 1:1 best mapping, although many of the centromeres are not yet complete in the HPRC/HGSVC samples. c) Plot showing the length of the active α-satellite HOR arrays among the CHM1 (red), CHM13 (black), and complete HPRC/HGSVC centromeres (various colors); n = 626. The α-satellite HOR arrays range in size from 0.03 Mbp on chromosome 4 to 6.5 Mbp on chromosome 11. Mean, solid black bar; 25% and 75% quartiles, dotted black bars.
Figure 3. Variation in length and sequence composition of human centromeric α-satellite HOR arrays.
a) Ratio of the length of the active α-satellite HOR arrays in the CHM1 genome compared to those in the CHM13 genome. b,c) Comparison of the b) CHM1 and CHM13 chromosome 5 D5Z2 α-satellite HOR arrays and c) CHM1 and CHM13 chromosome 11 D11Z1 α-satellite HOR arrays. The CHM1 chromosome 5 D5Z2 array contains two novel α-satellite HOR variants as well as a new evolutionary layer (Layer 4; indicated with an arrow), which is absent from the CHM13 array. Similarly, the CHM1 chromosome 11 D11Z1 α-satellite HOR array contains a 6-monomer HOR variant that is much more abundant than in the CHM13 array and comprises a new evolutionary layer (Layer 4; indicated with an arrow), although this 1.21-Mbp segment is more highly identical to the flanking sequence. The inset shows each of the new evolutionary layers with a higher stringency of sequence identity, as well as the relative position of the kinetochore.
Figure 4. Variation in the site of the kinetochore among two sets of human centromeres.
a) Plot comparing the length of the kinetochore site, marked by hypomethylated DNA and CENP-A-containing chromatin, between the CHM1 and CHM13 centromeres. b) Plot showing the difference in the position of the kinetochore between the CHM1 and CHM13 centromeres. c,d) Discovery of two potential kinetochores on the c) chromosome 13 and d) chromosome 19 centromeres in the CHM1 genome. The presence of two hypomethylated regions enriched with CENP-A chromatin likely represents two populations of cells, which may have arisen due to a somatic mutation, resulting in differing epigenetic landscapes. e) Comparison of the CHM1 and CHM13 chromosome 6 centromeres, which differ in kinetochore position by 2.4 Mbp. f) Comparison of the CHM1 and CHM13 chromosome 5 centromeres, showing that the sequences underlying the CHM1 kinetochore are conserved in approximately half of the HPRC genomes, but the same degree of conservation is not observed for the CHM13 kinetochore region.
Figure 5. Sequence and structure of six sets of centromeres from diverse primate species.
Complete assembly of centromeres from chromosomes 5, 10, 12, 20, 21, and X in human, chimpanzee, orangutan, and macaque reveals diverse α-satellite HOR organization and evolutionary landscapes. Sequence identity maps generated via StainedGlass are shown for each centromere (Methods), with the size of the α-satellite higher-order (human, chimpanzee, and orangutan) or dimeric (macaque) repeat array indicated in Mbp. The α-satellite suprachromosomal family (SF) for each centromeric array is indicated (vertical bar color), with arrows illustrating the orientation of the repeats within the array. Chromosome 12 in orangutan has a neocentromere, while the chromosome 21 centromere in macaque is no longer active due to a chromosomal fusion in that lineage. All chromosomes are labeled according to the human phylogenetic group nomenclature. The human diploid genome used as a control (second column) is HG00733—a 1000 Genomes sample of Puerto Rican origin. We note that the orangutan and macaque centromeres are drawn at half the scale with respect to the other apes.
Figure 6. Centromeres evolve with different evolutionary trajectories and mutation rates.
a-c) Phylogenetic trees of human, chimpanzee, orangutan, and macaque α-satellites from the higher-order and monomeric α-satellite regions of the chromosome 5, 12, and X centromeres, respectively. d-f) Plot showing the mutation rate of the chromosome 5, 12, and X centromeric regions, respectively. Individual data points from 10 kbp pairwise sequence alignments are shown.
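As a rough illustration of how per-window values could be derived from 10 kbp pairwise alignments, the sketch below counts mismatches in fixed windows of a gapped pairwise alignment and converts each divergence into substitutions per site per year. The function names, the gap handling, and the divergence / (2 × split time) conversion are assumptions made for illustration; the caption does not state the authors' procedure.

```python
from typing import Iterator


def window_divergence(aln_a: str, aln_b: str, window: int = 10_000) -> Iterator[float]:
    """Yield the fraction of mismatched, ungapped columns in each alignment window.

    Hypothetical helper: aln_a and aln_b are two rows of a pairwise alignment
    of equal length, with '-' marking gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    for start in range(0, len(aln_a), window):
        compared = 0
        mismatched = 0
        for x, y in zip(aln_a[start:start + window], aln_b[start:start + window]):
            if x == "-" or y == "-":
                continue  # gap columns are excluded (an assumption of this sketch)
            compared += 1
            if x.upper() != y.upper():
                mismatched += 1
        if compared:
            yield mismatched / compared


def rate_per_site_per_year(divergence: float, split_years: float) -> float:
    """Assumed conversion: both lineages accumulate changes for split_years."""
    if split_years <= 0:
        raise ValueError("split_years must be positive")
    return divergence / (2.0 * split_years)
```

Each yielded divergence would correspond to one data point per window; multiple-hit correction and alignment filtering are deliberately left out of this sketch.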
Figure 7. Phylogenetic reconstruction of human centromeric haplotypes and the saltatory amplification of new α-satellite HORs.
a) Strategy to determine the phylogeny and divergence times of completely sequenced centromeres using monomeric α-satellite or unique sequence flanking the canonical α-satellite HOR array from both the short (p) and long (q) arms of chromosomes 11 and 12. Chimpanzee is used as an outgroup with an estimated species divergence time of 6 million years ago. b,c) Maximum-likelihood phylogenetic trees depicting the p- and q-arm topologies, along with the estimated divergence times, reveal a monophyletic origin for the emergence of new α-satellite HORs within the b) chromosome 12 (D12Z3) and c) chromosome 11 (D11Z1) α-satellite HOR arrays. These arrays show a complex pattern of new α-satellite HOR insertions and deletions over a short period of evolutionary time.
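The outgroup calibration described in panel a can be illustrated with a deliberately simplified calculation. The sketch below assumes a strict molecular clock and scales a pairwise distance between two human haplotypes by the human-chimpanzee distance over the same flanking sequence, using the 6-million-year divergence time stated in the caption. This proportional scaling stands in for the maximum-likelihood dating used in the figure, and the function name and its parameters are hypothetical.

```python
def scaled_divergence_time(
    d_ingroup: float,
    d_outgroup: float,
    outgroup_split_mya: float = 6.0,
) -> float:
    """Estimate a haplotype split time in millions of years.

    Strict-clock assumption (not the authors' method): distance accumulates
    linearly with time, so the time ratio equals the distance ratio.
    d_ingroup  : pairwise distance between two human haplotypes
    d_outgroup : human-chimpanzee distance over the same region
    """
    if d_outgroup <= 0:
        raise ValueError("outgroup distance must be positive")
    if d_ingroup < 0:
        raise ValueError("ingroup distance must be non-negative")
    return outgroup_split_mya * d_ingroup / d_outgroup
```

Under this assumption, a haplotype pair whose distance is one sixth of the human-chimpanzee distance would be dated to roughly one million years ago.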
