Epigenetics Chromatin. 2015 Nov 16:8:46.

doi: 10.1186/s13072-015-0040-6. eCollection 2015.

RNA:DNA hybrids in the human genome have distinctive nucleotide characteristics, chromatin composition, and transcriptional relationships

Affiliations

¹ Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461 USA.
² Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461 USA ; Department of Biology, Center for Genomics and Systems Biology, New York University, 12 Waverly Place, New York, NY 10003 USA.
³ Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461 USA ; Integrated Genomics Operation, Memorial Sloan-Kettering Cancer Center, New York, NY 10065 USA.
⁴ Roche-NimbleGen, Madison, WI 53711 USA.
⁵ School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Galway, Ireland.
⁶ Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461 USA ; Department of Genetics, Center for Epigenomics and Division of Computational Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY 10461 USA.

PMID: 26579211
PMCID: PMC4647656
DOI: 10.1186/s13072-015-0040-6

Julie Nadel et al. Epigenetics Chromatin. 2015.

Abstract

Background: RNA:DNA hybrids represent a non-canonical nucleic acid structure that has been associated with a range of human diseases and potential transcriptional regulatory functions. Mapping of RNA:DNA hybrids in human cells reveals them to have a number of characteristics that give insights into their functions.

Results: We find RNA:DNA hybrids to occupy millions of base pairs in the human genome. A directional sequencing approach shows the RNA component of the RNA:DNA hybrid to be purine-rich, indicating a thermodynamic contribution to their in vivo stability. The RNA:DNA hybrids are enriched at loci with decreased DNA methylation and increased DNase hypersensitivity, and within larger domains with characteristics of heterochromatin formation, indicating potential transcriptional regulatory properties. Mass spectrometry studies of chromatin at RNA:DNA hybrids show the presence of the ILF2 and ILF3 transcription factors, supporting a model of certain transcription factors binding preferentially to the RNA:DNA conformation.

Conclusions: Overall, there is little to indicate that RNA:DNA hybrids depend on forming co-transcriptionally, with results from the ribosomal DNA repeat unit instead supporting the intriguing model of RNA generating these structures in trans. The results of the study indicate heterogeneous functions of these genomic elements and provide new insights into their formation and stability in vivo.

Keywords: Chromatin; DNA methylation; Mass spectrometry; Non-coding RNA; R-loop; RNA:DNA hybrid; Transcription; Transcription factor.

Figures

**Fig. 1**
Subcellular localization studies. In panel a we show the results of hybridization of the fluorescently-labeled RDIP-seq library to a control male metaphase preparation. The RDIP-seq library is shown in *red*, a bacterial artificial chromosome (BAC) probe mapping to chromosome 9 in *green*, and DNA counterstained by DAPI in *blue*. We observe a specific strong signal from the RDIP-seq library mapping to the p arms of acrocentric chromosomes (HSA13-15 and HSA21-22), indicating enrichment at the nucleolar organizing regions (NORs) encoding ribosomal RNAs, and at the pericentromeric region of chromosome 9. In panel b we show the results of immunofluorescence using the S9.6 antibody (*green*) with an antibody to fibrillarin (*red*), demonstrating co-localization with the intranuclear S9.6 antibody signal (merge) and therefore enrichment in nucleoli. Further signal from the nuclear periphery and the cytoplasm using S9.6 is also observed, which may represent detection by this antibody of RNA conformations rather than RNA:DNA hybrids specifically [48].

**Fig. 2**
Mapping of RNA:DNA hybrids within the ribosomal DNA repeat unit. The *upper panel* shows the results of RDIP-seq (*gray*) and RNA-seq (*red*), with genomic annotations and results of ChIP-seq analysis in K562 cells [55] plotted below. RDIP-seq and RNA-seq data are both represented using a smoothed plot showing the number of reads aligned to each basepair of the repeating unit, while the ChIP-seq data signal intensity represents the mean value of non-overlapping 50 bp windows. RDIP-seq values were normalized by subtracting the frequencies of aligned reads of the input sample in each window. We find that RNA:DNA hybrids co-localize with the rRNA transcripts, but that there are also RDIP-seq peaks of comparable magnitude in the intergenic spacer (IGS) where no transcriptional activity is apparent from RNA-seq. The RNA:DNA hybrids in the IGS are upstream of the promoter region and flank the upstream candidate *cis*-regulatory sequence where there is H3K4 methylation and acetylation of H3K9 and H3K27.

**Fig. 3**
Genomic distribution of RNA:DNA hybrids. In panel a we show that the proportion of reads mapping to rDNA is 2 %, and break down the remaining 98 % by genomic context, showing the majority of RNA:DNA hybrids (called as *peaks* using ChIP-seq analytical approaches) to be located in intergenic regions. To understand these RNA:DNA hybrid distributions, we calculated observed/expected ratios based on nucleotide occupancy of genomic features, and performed permutation analyses testing for the likelihood of randomized intersection (b), the results of which are shown in Additional file 2: Table S1. We found depletion of RNA:DNA hybrids at RefSeq gene bodies, intergenic regions, and SINE and DNA transposable elements, but significant enrichment at promoters and CpG islands, and at a number of purine-rich repetitive sequences.
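
As an illustration of the observed/expected ratio described in this caption, the following is a minimal sketch only, not the authors' pipeline. It assumes a single sequence, half-open `(start, end)` intervals that do not overlap within each list, and hypothetical function names (`overlap_bp`, `observed_expected`). Observed is taken as the fraction of peak bases that fall in a feature, and expected as the fraction of the genome that the feature occupies.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) coordinates on one sequence


def overlap_bp(a: List[Interval], b: List[Interval]) -> int:
    """Total bases shared between two interval lists.

    Assumes intervals within each list do not overlap one another.
    """
    total = 0
    for a_start, a_end in a:
        for b_start, b_end in b:
            total += max(0, min(a_end, b_end) - max(a_start, b_start))
    return total


def observed_expected(peaks: List[Interval],
                      features: List[Interval],
                      genome_size: int) -> float:
    """Observed/expected ratio based on nucleotide occupancy (illustrative)."""
    peak_bp = sum(end - start for start, end in peaks)
    feature_bp = sum(end - start for start, end in features)
    if peak_bp == 0 or feature_bp == 0 or genome_size <= 0:
        raise ValueError("peaks, features and genome size must be non-empty")
    observed = overlap_bp(peaks, features) / peak_bp  # share of peak bases in feature
    expected = feature_bp / genome_size               # share of genome in feature
    return observed / expected
```

A ratio above 1 would indicate enrichment of peaks at the feature and a ratio below 1 would indicate depletion. The permutation analysis mentioned in the caption is a separate step and is not covered by this sketch.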

**Fig. 4**
Nucleotide skewing analyses. In panel a we plot the skewing within a strand of A compared to T (x axis) or G compared to C (y axis) in the RNA:DNA hybrid peaks genome-wide. We find that the peaks are strongly over-represented for purine (G+A) and pyrimidine (C+T) skewing. As our sequencing approach allowed us to identify the RNA and DNA-derived strands separately in the RNA:DNA hybrid, in b we proceeded to test whether there was a relationship between skewing (based on the number of G+A divided by the total number of nucleotides) and each type of nucleic acid-derived sequence, finding a clear enrichment for purine skewing on the RNA-derived strand.
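
The purine skew measure stated in this caption (G+A divided by the total number of nucleotides) can be sketched as follows. This is a hedged illustration rather than the authors' code. The function names are hypothetical, ambiguous bases such as N are ignored, U is treated as T, and the per-strand A-versus-T and G-versus-C skews use the conventional (A−T)/(A+T) and (G−C)/(G+C) definitions, which is an assumption because the caption does not give the formula used for panel a.

```python
from typing import Dict, Tuple


def base_counts(seq: str) -> Dict[str, int]:
    """Count A, C, G and T on one strand; U is treated as T, other symbols ignored."""
    seq = seq.upper().replace("U", "T")
    return {base: seq.count(base) for base in "ACGT"}


def purine_fraction(seq: str) -> float:
    """(G + A) / total nucleotides, as described in the caption for panel b."""
    counts = base_counts(seq)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (counts["G"] + counts["A"]) / total


def at_gc_skew(seq: str) -> Tuple[float, float]:
    """Conventional AT and GC skews (assumed definitions, not from the source)."""
    c = base_counts(seq)
    at_total = c["A"] + c["T"]
    gc_total = c["G"] + c["C"]
    at_skew = (c["A"] - c["T"]) / at_total if at_total else 0.0
    gc_skew = (c["G"] - c["C"]) / gc_total if gc_total else 0.0
    return at_skew, gc_skew
```

Under these definitions a strand made up only of G and A has a purine fraction of 1, and a strand with equal purines and pyrimidines has a fraction of 0.5.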

**Fig. 5**
Transcriptional relationships of RNA:DNA hybrids. In a the proportion of RNA:DNA hybrid peaks in transcribed genes is shown to be higher than in non-transcribed genes, although the majority of genes do not contain RNA:DNA hybrids. In b a metaplot of RNA:DNA hybrid peaks is shown, illustrating the number of peaks intersecting with 100 bp windows, with the RNA of the hybrid on the transcribed strand of the gene (*red*) or the opposite strand (*blue*). This revealed an enrichment of the RNA-derived sequence on the transcribed strand in the first ~1.5 kb downstream from the transcription start site (TSS). A depletion of RNA:DNA hybrids is found at the transcription end site (TES). In c we show that the region immediately downstream from the TSS is purine-skewed, represented by skewing values of 100 bp windows averaged for all genes, but to the same degree in genes that form RNA:DNA hybrids (*blue*) as in genes that do not form these structures (*red*). In d a metaplot of RefSeq genes (*left*) shows that the transcription level of genes (as measured by RNA-seq) is positively associated with the number of RNA:DNA hybrids intersecting with 100 bp windows immediately downstream of the TSS. This reflects only modest increases in the small proportions of genes forming peaks (*right*), although the relationship was found to be significant using a proportions test.

**Fig. 6**
Macro-scale genomic associations of RNA:DNA hybrids. We used a least absolute shrinkage and selection operator (LASSO) adaptive regression approach to explore the association of genomic sequence features with RNA:DNA hybrid density in 500 kb windows. The figure shows the order in which covariates enter the model as the constraint on the sum of the regression coefficients (x axis) is progressively relaxed from 0 to its maximum value (corresponding to the ordinary least squares regression vector).

**Fig. 7**
Chromatin organizational studies at RNA:DNA hybrids using mass spectrometry. In panel a we show the experimental approach used for these proteomic studies. In b the altered pattern of enriched proteins compared with the input sample is seen using gel electrophoresis, alongside the results of Western blots confirming the enrichment of specific candidate proteins identified by mass spectrometry (ILF2, ILF3, hnRNP C1/C2), with SP1 and SP3 as controls known to bind to G-skewed DNA motifs.

Cited by

Ribonuclease H2 Subunit A Preserves Genomic Integrity and Promotes Prostate Cancer Progression.
Kimura N, Takayama KI, Yamada Y, Kume H, Fujimura T, Inoue S. Cancer Res Commun. 2022 Aug 25;2(8):870-883. doi: 10.1158/2767-9764.CRC-22-0126. PMID: 36923313.
Spliceosomal components protect embryonic neurons from R-loop-mediated DNA damage and apoptosis.
Sorrells S, Nik S, Casey MJ, Cameron RC, Truong H, Toruno C, Gulfo M, Lowe A, Jette C, Stewart RA, Bowman TV. Dis Model Mech. 2018 Feb 26;11(2):dmm031583. doi: 10.1242/dmm.031583. PMID: 29419415.
Senataxin resolves RNA:DNA hybrids forming at DNA double-strand breaks to prevent translocations.
Cohen S, Puget N, Lin YL, Clouaire T, Aguirrebengoa M, Rocher V, Pasero P, Canitrot Y, Legube G. Nat Commun. 2018 Feb 7;9(1):533. doi: 10.1038/s41467-018-02894-w. PMID: 29416069.
T helper cell-mediated epitranscriptomic regulation via m6A RNA methylation bridges link between coronary artery disease and invasive ductal carcinoma.
Rakshit S, Sunny JS, George M, Hanna LE, Leela KV, Sarkar K. J Cancer Res Clin Oncol. 2022 Dec;148(12):3421-3436. doi: 10.1007/s00432-022-04130-x. Epub 2022 Jul 1. PMID: 35776197.
Hot spots of DNA double-strand breaks in human rDNA units are produced in vivo.
Tchurikov NA, Yudkin DV, Gorbacheva MA, Kulemzina AI, Grischenko IV, Fedoseeva DM, Sosin DV, Kravatsky YV, Kretova OV. Sci Rep. 2016 May 10;6:25866. doi: 10.1038/srep25866. PMID: 27160357.

References

1. Wang J, Zhuang J, Iyer S, Lin X, Whitfield TW, Greven MC, Pierce BG, Dong X, Kundaje A, Cheng Y, et al. Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Res. 2012;22:1798–1812. doi: 10.1101/gr.139105.112.
2. Neph S, Vierstra J, Stergachis AB, Reynolds AP, Haugen E, Vernot B, Thurman RE, John S, Sandstrom R, Johnson AK, et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature. 2012;489:83–90. doi: 10.1038/nature11212.
3. Natarajan A, Yardimci GG, Sheffield NC, Crawford GE, Ohler U. Predicting cell-type-specific gene expression from regions of open chromatin. Genome Res. 2012;22:1711–1722. doi: 10.1101/gr.135129.111.
4. Yip KY, Cheng C, Bhardwaj N, Brown JB, Leng J, Kundaje A, Rozowsky J, Birney E, Bickel P, Snyder M, Gerstein M. Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors. Genome Biol. 2012;13:R48. doi: 10.1186/gb-2012-13-9-r48.
5. Hu S, Wan J, Su Y, Song Q, Zeng Y, Nguyen HN, Shin J, Cox E, Rho HS, Woodard C, et al. DNA methylation presents distinct binding sites for human transcription factors. eLife. 2013;2:e00726.
