[Preprint]. 2025 Jun 8:2025.06.05.658139.
doi: 10.1101/2025.06.05.658139.

Identification of chromatin-associated RNAs at human centromeres

Kelsey Fryer et al. bioRxiv.

Abstract

Centromeres are specialized chromatin domains required for the assembly of the mitotic kinetochore and the accurate segregation of chromosomes. Non-coding RNAs play essential roles in regulating genome organization, including at the unique chromatin environment present at human centromeres. We performed Chromatin-Associated RNA sequencing (ChAR-seq) in three different human cell lines to identify and map RNAs associated with centromeric chromatin. Centromere-enriched RNAs display distinct contact behaviors across repeat arrays and generally belong to three categories: centromere-encoded RNAs, nucleolar-localized RNAs, and highly abundant, broad-binding RNAs. Most centromere-encoded RNAs remain locally associated with their transcription locus, with the exception of a subset of human satellite RNAs. This work provides a comprehensive identification of centromere-bound RNAs that may regulate the organization and activity of the centromere.

Conflict of interest statement

Declarations of interest: The authors declare no competing interests.

Figures

Figure 1: CASK annotation of ChAR-seq identifies centromeric RNA-DNA contacts
A. Schematic of the ChAR-seq bridge molecule (top) and protocol (bottom). B. Schematic of Classification of Ambivalent Sequences (CASK). K-mer databases were generated from T2T-assembled centromeric repeat sequences and intersected to create groups of k-mers present in each repeat type or combination of repeat types. Reads were then classified into repeat types or ambivalence groups containing multiple repeat types based on their k-mer content. C, D. Log of (C) DNA or (D) RNA ChAR-seq reads classified by CASK in K562 cells for each repeat domain, normalized by the number of DpnII sites in the domain per million reads, vs. length of the repeat domain in Mbp. Reads classified as rDNA regions were excluded. E. Log of the ratio of RNA reads to DNA reads classified as each domain, normalized by the number of DpnII sites in the domain, vs. length of the domain in Mbp. Reads classified as rDNA regions were excluded. F. Proportion of RNA reads enriched at centromeric (purple) or non-centromeric (gray) domains belonging to each RNA type in K562 cells (top), ES cells (middle), and DE cells (bottom).
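The k-mer classification idea in panel B can be illustrated with a minimal Python sketch. This is not the authors' CASK implementation: the function names, the toy sequences, the repeat labels, the value of k, and the rule of intersecting the repeat-type sets of all recognized k-mers in a read are assumptions made purely for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_kmer_groups(repeat_seqs, k):
    """Map each k-mer to the frozenset of repeat types whose sequences contain it."""
    owners = defaultdict(set)
    for repeat_type, seqs in repeat_seqs.items():
        for seq in seqs:
            for kmer in kmers(seq, k):
                owners[kmer].add(repeat_type)
    return {kmer: frozenset(types) for kmer, types in owners.items()}


def classify_read(read, kmer_groups, k):
    """Return the repeat types compatible with every recognized k-mer in the read.

    One type means a unique assignment, several types form an ambivalence
    group, and an empty set means unclassified (no recognized k-mers, or
    k-mers that disagree). The intersection rule is an assumption of this
    sketch, not a description of the published tool.
    """
    candidates = None
    for kmer in kmers(read, k):
        group = kmer_groups.get(kmer)
        if group is None:
            continue  # k-mer absent from every repeat database; ignored here
        candidates = group if candidates is None else candidates & group
    return candidates if candidates is not None else frozenset()


# Toy, made-up sequences and labels (hypothetical, not real repeat sequences).
toy_repeats = {
    "HOR": ["ACGTACGTAA"],
    "HSat": ["ACGTTTGGCC"],
}
K = 4  # assumed k-mer length for the toy example
groups = build_kmer_groups(toy_repeats, K)

unique_call = classify_read("GTACG", groups, K)      # k-mers found only in HOR
ambivalent_call = classify_read("ACGT", groups, K)   # k-mer shared by both types
unclassified = classify_read("GGGGGG", groups, K)    # no recognized k-mers
```

In this toy setup, a read whose k-mers occur in only one repeat type receives a unique label, while a read made of shared k-mers falls into an ambivalence group containing both types.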
Figure 2: Nucleolar RNAs enriched at centromeric repeats
A-C. Heatmap of R scores for nucleolar RNAs (y-axis) enriched at centromeric repeat domains (x-axis) in A. K562 cells, B. DE cells, and C. ES cells. DNA-side reads were classified into centromeric repeat domains using CASK and are denoted on the x-axis with domain type label and chromosome. RNA reads were classified by genomic alignment and are denoted with gene name and the chromosome they originate from. Darker color indicates contacts with higher R scores (larger difference between the observed and predicted number of reads), indicating stronger enrichment above background contacts. Mitochondrial RNAs, ribosomal RNAs, ribosomal DNA, reads that could not be uniquely assigned to a single repeat type, and all contacts with fewer than 0.2 CPM (counts per million total reads) were excluded from the heatmap.
Figure 3: Non-nucleolar localized RNAs enriched at centromeric DNA
A-C. Heatmap of R scores for other exon-derived RNAs (y-axis) enriched at centromeric repeat domains (x-axis) in A. K562 cells, B. DE cells, and C. ES cells. DNA-side reads were classified into centromeric repeat domains using CASK and are denoted on the x-axis with domain type label and chromosome. RNA reads were classified by genomic alignment and are denoted on the y-axis with gene type (leftmost label), gene name (middle label), and the chromosome they originate from (right label). Darker color indicates contacts with higher R scores (larger difference between the observed and predicted number of reads), indicating stronger enrichment above background contacts. Mitochondrial RNAs, ribosomal RNAs, ribosomal DNA, reads that could not be uniquely assigned to a single repeat type, and all contacts with fewer than 0.2 CPM were excluded from the heatmap.
Figure 4: Centromere derived RNAs enriched at centromeric DNA
A-D. Heatmap of R scores for RNAs derived from centromere and centromere transition (CT) regions (y-axis) enriched at centromere/CT domains (x-axis) in K562 cells. DNA and RNA reads were classified using CASK and are denoted with domain type label and chromosome. Darker color indicates contacts with higher R scores (larger difference between the observed and predicted number of reads), indicating stronger enrichment above background contacts. Contacts with R scores below the 95th percentile were considered not significantly enriched and were excluded from the heatmaps. Reads classified as ribosomal on either the RNA or DNA side, reads that could not be uniquely assigned to a single repeat type, and all contacts with fewer than 0.1 CPM were excluded from the heatmaps. A. Class-level assigned centromere-derived RNAs enriched at centromere DNA domains. B. Repeat-type-level HOR-derived RNAs enriched at HOR DNA domains. Green label corresponds to the T2T active HOR annotation, denoting a CENP-A-containing HOR. Red, inactive label denotes an HOR that does not contain CENP-A. C. Centromere/CT-derived RNAs (y-axis) enriched at CT regions (x-axis). D. CT-derived RNAs (y-axis) enriched at centromeric domains (x-axis).
