[Preprint]. 2025 Sep 1:2025.08.28.672650.
doi: 10.1101/2025.08.28.672650.

An RNA polymerase III tissue and tumor atlas uncovers context-specific activities linked to 3D epigenome regulatory mechanisms

Simon Lizarazo et al. bioRxiv.

Abstract

RNA polymerase III (Pol III) produces a plethora of small noncoding RNA species involved in diverse cellular processes, from transcription regulation and splicing to RNA stability, translation, and proteostasis. Though Pol III activity is broadly coupled with cellular demands for protein synthesis and growth, a more precise understanding of gene-level dynamics and context-specific expression patterns remains missing, in part due to challenges related to sequencing and mapping Pol III-derived small ncRNAs. Here, we establish a predictive multi-tissue map of human Pol III activity across 19 tissues and 23 primary cancer subtypes by comprehensively profiling the chromatin accessibility of canonical Pol III-transcribed gene classes. Our framework relies on the unique relationship between gene accessibility and Pol III transcription, inferring activity through uniform binary classification of ATAC-seq enrichment at Pol III-transcribed genes. By characterizing multi-context gene uniformity, we provide a definition of the core Pol III transcriptome, broadly active across specialized tissues, and catalog genes with varied levels of context specificity. Our genomic Pol III atlas uncovers variable levels of activity across tissues, including sharp contraction of the Pol III transcriptome in heart and brain tissues and frequent expansion across diverse cancers. We show that both tissue- and tumor-specific genes are significantly enriched within lamina-associated domains (LADs), and that aberrant expression of nuclear lamin proteins is sufficient to induce Pol III-emergent patterns at tumor-specific genes. Together, these findings link Pol III dynamics to subnuclear compartmentalization and provide a resource for better understanding Pol III expansion and small RNA biogenesis in cancer.

Conflict of interest statement

Competing interest statement: The authors declare no competing interests.

Figures

Figure 1. ATAC-measured gene accessibility scales with Pol III transcription signatures and recovers tissue-specific Pol III activity patterns.
(a) Categorical overview of the small RNA species encoded by Pol III-transcribed genes and included in our large-scale genomic survey. (b) Analysis of chromatin accessibility (ATAC-seq) across Pol III-transcribed genes (shown in a), when binned by Pol III occupancy (ChIP-seq). (c) Illustrative overview of the gene “on” / gene “off” classification method used in this study. ATAC-measured gene accessibility is compared against a gene-specific local chromatin background (lambda); genes are inferred to be permissive and transcriptionally active (“on”) if accessibility signals are significant. (d) Visualization of tissue-specificity scores, measured by Shannon’s entropy (see methods). Individual data points represent Pol III-transcribed genes; radially positioned genes are characterized as context-specific. (e) ATAC-seq signal pileup at specific examples of Pol III-transcribed genes defined as shared (‘housekeeping’) or tissue-specific (‘brain’, ‘colon’, ‘heart’, ‘liver’, ‘lung’). nRPKM = normalized reads per kilobase per million mapped reads. (f-g) Pol III occupancy at brain-specific and heart-specific genes (related to panel e). Pol III ChIP-seq data represent large subunit POLR3A (RPC1) binding in induced pluripotent stem cells (hiPSCs), hippocampal neural precursors, and mature neurons (f); or in human embryonic stem cells (hESCs) and cardiomyocytes (CM) (g).
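The "on" / "off" call described in panel c can be illustrated with a minimal sketch. It assumes the local background lambda is the expected read count under a Poisson model, as in common peak callers; the caption does not state the test used, so the Poisson choice, the function names, and the `alpha` threshold are all illustrative assumptions rather than the authors' method.

```python
import math

def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X = 0..k-1) iteratively to avoid large factorials.
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def call_gene_state(read_count, local_lambda, alpha=0.01):
    """Label a gene "on" if its ATAC read count is significantly above
    the local background (hypothetical threshold alpha), else "off"."""
    p_value = poisson_sf(read_count, local_lambda)
    state = "on" if p_value < alpha else "off"
    return state, p_value

if __name__ == "__main__":
    # Toy counts, not data from the study.
    print(call_gene_state(30, 5.0))
    print(call_gene_state(5, 5.0))
```

Using a gene-specific lambda rather than a genome-wide one means a gene in generally open chromatin must clear a higher bar than a gene in a closed neighborhood.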
Figure 2. Tissue-level Pol III signatures include variable levels of restriction and expansion involving specific classes of Pol III-transcribed genes.
(a) Hierarchical clustering of 134 sample-specific Pol III gene activity signatures. Individual experiments are categorically colored by human tissue, with the heart, brain, colon, liver, and lung highlighted for clarity. (b) Overview of the variable breadth and number of tissue-specific Pol III-transcribed gene features defined in the heart, brain, colon, liver, and lung tissues. (c) Visualization of Pol III-transcribed genes classified as having high, intermediate, or low uniformity and ensuing activity patterns within heart, brain, colon, liver, and lung tissues. Columns reflect individual experiments; tissue colors indicate “on” state, white color indicates “off” state. (d) Hierarchical subclustering of neuronal and nonneuronal samples derived from human brain tissues (related to panels a and d). (e) Heatmap visualization of 212 active Pol III-transcribed genes in human brain tissues in neuronal (light blue; top) and nonneuronal experiments (dark blue; bottom), subdivided by multi-tissue classification of high, intermediate, or low gene uniformity (related to panels b and c). (f) Composition of high, intermediate, and low uniformity subclassifications on the basis of annotated Pol III-transcribed gene subcategories. Inset serves to highlight these subcategories, in juxtaposition to the expansive number of annotated but transcriptionally inactive instances of various Pol III-transcribed gene classes (bottom barplot).
Figure 3. Sequence-based prediction of tRNA gene activity patterns recovers A- and B-box promoter elements, as well as uncharacterized gene proximal elements.
(a) Predictive power of sequences enriched at tRNA genes for classifying active versus inactive tRNA genes. Motifs - discovered de novo by STREME - are ranked by Random Forest importance score. A 15-mer sequence element with maximum importance and significant overlap with previously reported B-box elements is highlighted. Sequences with significant B-box (blue) and A-box (red) similarity are further positionally mapped within active tRNA genes (inset). (b) Analogous plot of sequence elements ranked in panel a, but showing Random Forest importance score for predicting high or intermediate uniformity versus low uniformity (i.e. tissue-specific) tRNA genes. A 14-mer sequence element with maximum importance and significant similarity to previously reported A-box elements is highlighted. (c) Positional frequency histograms of A-box and B-box-like elements in active tRNA genes. (d) Empirical human A- and B-box sequence elements derived by positional frequencies shown in panel c. (e) Uncharacterized predictive sequences PS1 and PS2 shown to have high importance in classifying tRNA active versus inactive state. (f) Individual frequencies of predictive sequence elements within active and inactive tRNA genes. In contrast to other features, PS1 and PS2 occur at higher frequency in inactive genes. (g) Positional map of PS1 (gold) and PS2 (purple) occurrences within active and inactive tRNA genes.
Figure 4. Epigenome-based prediction of Pol III activity uncovers strong enrichment of context-restricted ncRNA genes within Lamina-associated domains (LADs) and peripheral subcompartments.
(a) Illustrative overview of the epigenome-based survey for features that predict Pol III gene activity. Random Forest analysis incorporated 147 features including chromatin accessibility (ATAC), Transcription Factor (TF)-binding, histone modifications (ChIP-seq), Lamin association (DamID) and DNA methylation (Bisulfite-seq) profiles retrieved from ChIPatlas and 4DN. In brief, intra-factor gene-level statistics spanning 1 kb gene-centered intervals were normalized across all Pol III-transcribed genes. Multivariate class prediction was then evaluated over 500 training-test splits (75 v. 25). (b) Classification model performance; Mean AUC (Area Under Curve) = 0.86; response variable randomized AUC = 0.51. (c) Ranked importance of epigenomic features for classifying Pol III-transcribed gene state across all human tissues (MDA = mean decrease accuracy). Only the top 20 features are shown for clarity; Normalized summary score represents average gene-level signal statistic across individual gene groups. (d) Comparison of feature signal enrichment at high and intermediate (H.I.) uniform genes versus low (L) uniformity genes. Barplot (top) depicts the relative enrichment for genes with higher-than-median signal. Heatmap (bottom) visualizes normalized signal scores at 10 bp resolution (bin size) for individual genes in each subgroup. (e) Heatmap of feature enrichment z-scores at high, intermediate, and low uniformity genes; only features with significant enrichment (z > 2 in one or more groups) are shown for clarity. (f) Illustrative overview of LAD frequency calculation, which takes into account all LAD (4DN Lamin DamID-based) calls and computes genomic cross-experiment frequency. (g) Observation and empirical null distributions for average LAD frequencies spanning high, intermediate, and low uniformity genes. (h) Analogous analysis corresponding to radial subnuclear positions mapped by GPS-seq.
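The LAD frequency and empirical null comparison in panels f and g can be sketched roughly as follows. This is a simplified illustration under stated assumptions: coordinates are on a single chromosome, each experiment's LAD calls are a list of half-open `(start, end)` intervals, and the null is built by drawing random gene sets of matching size. The function names, the one-sided test, and `n_perm` are hypothetical and not taken from the source.

```python
import random

def lad_frequency(gene_pos, lad_calls):
    """Fraction of experiments whose LAD intervals contain gene_pos.

    lad_calls: one list of (start, end) intervals per experiment.
    """
    hits = sum(
        any(start <= gene_pos < end for start, end in intervals)
        for intervals in lad_calls
    )
    return hits / len(lad_calls)

def empirical_p(observed_mean, all_freqs, group_size, n_perm=1000, seed=0):
    """One-sided empirical p-value: how often a random gene set of the
    same size has a mean LAD frequency at least as high as observed."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        sample = rng.sample(all_freqs, group_size)
        if sum(sample) / group_size >= observed_mean:
            exceed += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (exceed + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Toy LAD calls from three hypothetical experiments.
    calls = [[(0, 100)], [(50, 150)], [(200, 300)]]
    print(lad_frequency(75, calls))
```

Sampling gene sets of matching size controls for the fact that small groups have noisier mean frequencies than large ones.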
Figure 5. Pol III-transcribed gene patterns in cancer are dominated by tumor-specific signatures enriched within LADs.
(a) Visualization of tumor specificity scores measured by Shannon’s entropy. Individual data points represent Pol III-transcribed genes; radially positioned genes are characterized as tumor-specific. (b) ATAC-seq signal pileup at specific examples of Pol III-transcribed genes defined as tumor-specific. (GBM = Glioblastoma multiforme; LGG = Brain Lower Grade Glioma; COAD = Colon adenocarcinoma; LIHC = Liver hepatocellular carcinoma; LUAD = Lung adenocarcinoma; LUSC = Lung squamous cell carcinoma). (c) Number of genes predicted to be active between 1 tumor subtype and all 23 tumor types. Stacked barchart denotes gene tissue uniformity (gray, black) and tumor-emergent “new” genes (red). (d) Composition of tumor-emergent genes by Pol III-transcribed gene subclass. (e) Radial barchart depicting the number of predicted Pol III-transcribed genes active in GBM, LGG, COAD, LIHC, LUAD, and LUSC, compared against relevant tissue data (brain, colon, liver, lung, respectively). (f) LAD (Lamina-associated domain) frequencies for tumor-emergent ‘new’ genes compared against an empirical expectation, accounting for the total new genes in each cancer subtype. (g) Neighborhood analysis of local expression patterns in tumors compared against randomized samples; tumors with significantly higher or lower local expression within a window size of +/− 1.5 Mb (centered on tumor-specific genes) are colored; permutation p-value.
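An entropy-based specificity score of the kind named in panel a can be sketched as below. The exact definition is given in the preprint's methods and is not reproduced on this page, so this is one common formulation and an assumption: per-context activity values are normalized to a distribution, Shannon's entropy is computed in bits, and specificity is taken as one minus the entropy divided by its maximum. The function names are hypothetical.

```python
import math

def shannon_entropy(activity):
    """Shannon entropy (bits) of a gene's activity across contexts.

    activity: non-negative values, one per tissue or tumor type,
    e.g. the fraction of samples in which the gene is called "on".
    """
    total = sum(activity)
    if total <= 0:
        raise ValueError("gene is inactive in every context")
    probs = [a / total for a in activity if a > 0]
    return -sum(p * math.log2(p) for p in probs)

def specificity_score(activity):
    """Assumed score: 0 for uniform activity, 1 for a single context."""
    max_entropy = math.log2(len(activity))
    return 1.0 - shannon_entropy(activity) / max_entropy

if __name__ == "__main__":
    # Toy activity profiles, not data from the study.
    print(specificity_score([1, 1, 1, 1]))
    print(specificity_score([0, 0, 3, 0]))
```

Under this formulation a gene active in only one tumor type scores 1, while a gene active everywhere scores 0.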
Figure 6. Aberrant lamin expression, an unfavorable hallmark in cancer, leads to expansion of Pol III-transcribed gene targets.
(a) Distribution of median pan-cancer survival z-scores across 1,498 unique HGNC (HUGO Gene Nomenclature Committee) gene groups. High z-scores are linked with unfavorable outcomes in cancer (tcga-survival.com). (b) Immunoblot analysis of lamin protein abundance in HEK293 cells following overexpression of either GFP (control), lamin A (LMNA), lamin B1 (LMNB1), or lamin B2 (LMNB2). (c) Integrative analysis of gene accessibility (ATAC-seq) and ncRNA abundance (small RNA-seq) in HEK293 cells. (d) Number of genes predicted to be active in HEK293 overexpression experiments, stratified by subcategories defined in the multi-tissue and tumor atlases. H (High), I (Intermediate), L (Low), New (tumor active), Off (active in HEK293, inactive in tissues and tumors). Y-axis depicts the proportional change in total active genes. (e-f) Distribution of gene accessibility (e) and RNA abundance (f) at Pol III-transcribed genes, stratified by gene groups defined in panel d. TPM = transcripts per million. (g) Gene ontology enrichment analysis for oncogenic gene signatures (MSigDB) at New genes located within intervals of high or low LAD frequency. (h) Model illustration: Lamin-associated domains and cancer-related lamin dysfunction are linked with tissue- and tumor-specific Pol III patterns.
