Genome Biol. 2025 May 2;26(1):111.
doi: 10.1186/s13059-025-03578-7.

Four near-complete genome assemblies reveal the landscape and evolution of centromeres in Salicaceae

Yubo Wang et al. Genome Biol. 2025.

Abstract

Background: Centromeres play a crucial role in maintaining genomic stability during cell division. They are typically composed of large arrays of tandem satellite repeats, which hinder high-quality assembly and complicate our efforts to understand their evolution across species. Here, we use long-read sequencing to generate near-complete genome assemblies for two Populus and two Salix species belonging to the Salicaceae family and characterize the genetic and epigenetic landscapes of their centromeres.

Results: Only a limited number of satellite repeats are present as centromeric components in these species; most satellite repeats are located outside the centromeres but exhibit a homogenized structure similar to that of Arabidopsis centromeres. Instead, the Salicaceae centromeres are mainly composed of abundant transposable elements, including CRM and ATHILA, while LINE elements are found exclusively in the poplar centromeres. Comparative analysis reveals that these centromeric repeats are extensively expanded and interspersed with satellite arrays in a species-specific and chromosome-specific manner, driving rapid turnover of centromeres in both sequence composition and genomic location in the Salicaceae.

Conclusions: Our results highlight the dynamic evolution of diverse centromeric landscapes among closely related species, mediated by satellite homogenization and widespread invasions of transposable elements, and shed further light on the role of centromeres in genome evolution and species diversification.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Syntenic alignments among the assembled Salicaceae haplotype genomes and their reference genomes for P. alba var. pyramidalis (a), P. euphratica (b), S. chaenomeloides (c), and S. arbutifolia (d). The syntenic regions among haplotype I (left), reference genome (middle) and haplotype II (right) are shown by gray lines. Centromeres are shown as khaki triangles. Telomeres are shown as black triangles at chromosome ends. Inversions between each haplotype and the reference genome are shown as orange blocks. 5S and 45S rDNA regions are shown as blue and yellow rectangles, respectively. TE density of each chromosome is shown as cyan histograms
Fig. 2
Profiles of Salicaceae centromeres. a Salicaceae CENH3 genes. The phylogenetic tree (left) was constructed from CENH3 genes in the assembled genomes as well as the published genomes of P. trichocarpa and S. purpurea, and branches are color-coded according to CENH3 type. In the schematic illustration of Salicaceae CENH3 genes (right), boxes represent exons. Two copies of the CENH3 gene are identified in willows, whereas in poplars CENH3-2 has lost exons at the N terminus and become dysfunctional. b Line graph of cumulative centromere length in different genomes. c Characteristics and epigenetics of representative centromeres of P. alba var. pyramidalis haplotype I, P. euphratica haplotype I, and S. chaenomeloides haplotype I. Plots from top to bottom represent TRAs on forward (red) and reverse (blue) strands and CENH3 CUT&Tag distribution per 10 kb, transposable element distribution per 10 kb, DNA methylation level per 10 kb, histone modification level per 10 kb, gene distribution, CRM element distribution, and sequence similarity in centromeres and adjacent regions. Regions between grey dashed lines represent centromeres. d The chromosome-scale histone modification features of P. alba var. pyramidalis haplotype I; the lower panel is a close-up plot of the centromeres. e Same as d but showing the WGBS methylation features. f Dot plot of syntenic pericentromeres from all genome assemblies using a 500-bp search window. Red and blue points indicate forward- and reverse-strand similarity, respectively
Fig. 3
Characteristics of centromeric Palv156 TRAs. a Histograms of Palv156 monomer lengths (left) and variant distances relative to the genome-wide consensus (right) in P. alba var. pyramidalis haplotype I. b Histograms of monomer number (length) of Palv156 in HOR blocks (left) and the distances between HORs (right) in P. alba var. pyramidalis haplotype I. c Representative Palv156 TRA region heatmap colored according to pairwise variants between Palv156 monomers. d Dot plot of Palv156 HOR distances and HOR variation levels in P. alba var. pyramidalis haplotype I. e Phylogenetic tree of sampled Palv148 monomers. The colors of the outer circle and the tree branches represent the corresponding chromosome and genome haplotype, respectively. f Boxplot of Palv156 TRA monomer similarity comparisons within and between chromosomes, haplotypes, and species
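The "variant distance relative to the consensus" in panel a can be illustrated with a minimal sketch. It assumes the monomers have already been aligned to equal length and uses a simple per-column majority consensus with a mismatch count; the function names and the toy sequences are hypothetical, and this is not necessarily the procedure the authors used.

```python
from collections import Counter


def consensus(monomers):
    """Majority base at each column; assumes monomers are pre-aligned to equal length."""
    length = len(monomers[0])
    if any(len(m) != length for m in monomers):
        raise ValueError("monomers must be aligned to the same length")
    # zip(*monomers) walks the alignment column by column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*monomers))


def variant_distance(monomer, cons):
    """Number of positions at which a monomer differs from the consensus."""
    return sum(a != b for a, b in zip(monomer, cons))


# Toy, made-up monomers (real satellite monomers here are roughly 150 bp long).
monomers = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC"]
cons = consensus(monomers)
distances = [variant_distance(m, cons) for m in monomers]
```

A histogram of `distances` over all monomers in an array would correspond to the kind of plot described in panel a; low values across an array indicate a homogenized repeat.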
Fig. 4
Characteristics of non-centromeric Palv148 TRAs. a Characteristics and epigenetics of Palv148 TRAs in chr09 of P. alba var. pyramidalis haplotype II. Plots from top to bottom represent TRAs on forward (red) and reverse (blue) strands and CENH3 CUT&Tag distribution per 10 kb, transposable element distribution per 10 kb, DNA methylation level per 10 kb, histone modification level per 10 kb, gene distribution, CRM element distribution, and sequence similarity in centromeres and adjacent regions. b and c show histograms of Palv148 monomer lengths (left) and variant distances relative to the genome-wide consensus (right) in P. alba var. pyramidalis haplotype I and haplotype II, respectively. d Phylogenetic tree of sampled Palv148 and Sar145 monomers. The colors of the outer circle and the tree branches represent the corresponding chromosome and genome haplotype, respectively. e Boxplot of Palv148 and Sar145 TRA monomer similarity comparisons within and between chromosomes, haplotypes, and species. f and g show histograms of monomer number (length) of Palv148 in HOR blocks (left) and the distances between HORs (right) in P. alba var. pyramidalis haplotype I and haplotype II, respectively. h Representative Palv148 TRA region heatmap colored according to pairwise variants between Palv148 monomers. i Comparison of Hi-C interaction strengths of Palv148 TRAs with centromeres and with non-centromeric regions (two-tailed Wilcoxon rank-sum test, ****P ≤ 0.0001, ***P ≤ 0.001, **P ≤ 0.01, *P ≤ 0.05, ns: not significant)
Fig. 5
Analysis of CRM elements in Salicaceae genomes. a Boxplot of repetitive sequence statistics within centromeres; the number for each boxplot gives the percentage of repetitive sequences. b Boxplot of CRM, ATHILA, LINE, and other repetitive sequence statistics within centromeres. c Phylogenetic tree of intact CRM elements. Colors of the outer circle, the inner circle, and the tree branches represent the corresponding CRM position, species, and genome haplotype, respectively. The distance scale is 0.1. d Boxplots of LTR identity comparisons between centromeric and non-centromeric CRM elements. Asterisks represent significant differences (two-tailed Wilcoxon rank-sum test, ****P ≤ 0.0001, ***P ≤ 0.001, **P ≤ 0.01, *P ≤ 0.05, ns: not significant). e Same as d but showing comparisons of LTR insertion time
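Panels d and e relate LTR identity to insertion time. A common way to make that conversion, sketched here under stated assumptions, is T = K / (2r), where K is the divergence between the two LTRs of one element and r is a substitution rate per site per year. The Jukes-Cantor correction and the rate value below are illustrative assumptions, not parameters reported in this abstract, and the function names are hypothetical.

```python
import math


def jukes_cantor_distance(identity):
    """Convert pairwise LTR identity (0..1) to a Jukes-Cantor corrected distance K."""
    p = 1.0 - identity  # observed proportion of differing sites
    if not 0.0 <= p < 0.75:
        raise ValueError("identity must be in (0.25, 1.0] for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_time_years(identity, rate_per_site_per_year):
    """Estimate insertion time as T = K / (2r); both LTRs accumulate substitutions."""
    return jukes_cantor_distance(identity) / (2.0 * rate_per_site_per_year)


# ASSUMPTION: placeholder substitution rate, chosen only for illustration.
ASSUMED_RATE = 2.5e-9

young = insertion_time_years(0.99, ASSUMED_RATE)  # nearly identical LTRs
old = insertion_time_years(0.95, ASSUMED_RATE)    # more diverged LTRs
```

Because the two LTRs of an element are identical at insertion, higher identity implies a more recent insertion, so `young` is smaller than `old`. The absolute values scale inversely with whatever rate is assumed.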
Fig. 6
Analysis of centromere repositioning between P. alba var. pyramidalis and S. chaenomeloides. a Synteny analysis of homologous chromosomes between P. alba var. pyramidalis and S. chaenomeloides. The two straight lines in the middle represent homologous chromosomes scaled by chromosome length, and dark grey boxes on the lines represent the corresponding centromeres. Grey lines connecting chromosomes represent homologous gene pairs between P. alba var. pyramidalis and S. chaenomeloides, and green lines represent homologous genes located within centromeres and pericentromeric regions of P. alba var. pyramidalis. Histograms represent the distribution of each type of LTR and LINE element from P. alba var. pyramidalis and S. chaenomeloides. CUT&Tag data coverage from both S. chaenomeloides and P. alba var. pyramidalis on the P. alba var. pyramidalis genome is shown at the top. CUT&Tag data coverage from both S. chaenomeloides and P. alba var. pyramidalis on the S. chaenomeloides genome is shown at the bottom. b Boxplot of expression comparison between genes within repositioned centromeres in P. alba var. pyramidalis and genes within homologous regions in S. chaenomeloides, and vice versa. Asterisks represent significant differences (two-tailed Wilcoxon rank-sum test, ****P ≤ 0.0001, ***P ≤ 0.001, **P ≤ 0.01, *P ≤ 0.05, ns: not significant). c Metaprofiles of methylation levels for genes within repositioned centromeres and homologous regions. d Same as b but showing TAD length comparisons. e Same as c but showing metaprofiles of histone modification levels. Palv_cen: genes within repositioned centromeres in P. alba var. pyramidalis. Sch_col: genes within homologous regions of Palv_cen in S. chaenomeloides. Sch_cen: genes within repositioned centromeres in S. chaenomeloides. Palv_col: genes within homologous regions of Sch_cen in P. alba var. pyramidalis
