Genome Res. 2017 Jun;27(6):1063-1073.
doi: 10.1101/gr.219394.116. Epub 2017 Mar 24.

RNA-DNA hybrid (R-loop) immunoprecipitation mapping: an analytical workflow to evaluate inherent biases

László Halász et al. Genome Res. 2017 Jun.

Erratum in

Abstract

The impact of R-loops on the physiology and pathology of chromosomes has been demonstrated extensively by chromatin biology research. The progress in this field has been driven by technological advancement of R-loop mapping methods that largely relied on a single approach, DNA-RNA immunoprecipitation (DRIP). Most of the DRIP protocols use the experimental design that was developed by a few laboratories, without paying attention to the potential caveats that might affect the outcome of RNA-DNA hybrid mapping. To assess the accuracy and utility of this technology, we pursued an analytical approach to estimate inherent biases and errors in the DRIP protocol. By performing DRIP-sequencing, qPCR, and receiver operator characteristic (ROC) analysis, we tested the effect of formaldehyde fixation, cell lysis temperature, mode of genome fragmentation, and removal of free RNA on the efficacy of RNA-DNA hybrid detection and implemented workflows that were able to distinguish complex and weak DRIP signals in a noisy background with high confidence. We also show that some of the workflows perform poorly and generate random answers. Furthermore, we found that the most commonly used genome fragmentation method (restriction enzyme digestion) led to the overrepresentation of lengthy DRIP fragments over coding ORFs, and this bias was enhanced at the first exons. Biased genome sampling severely compromised mapping resolution and prevented the assignment of precise biological function to a significant fraction of R-loops. The revised workflow presented herein is established and optimized using objective ROC analyses and provides reproducible and highly specific RNA-DNA hybrid detection.

Figures

Figure 1.
Experimental design: constructing DRIP schemes. (A) Experiments 1–16 explore the effect of formaldehyde fixation (Step 1), nucleic acid isolation (Step 2), removal of free RNA (Step 3), and nucleic acid fragmentation (Step 4) on the outcome of RNA-DNA hybrid detection. Each experiment was performed in parallel at two cell lysis temperatures (65°C and 37°C). The temperature variable is not depicted in the cartoon, but it is referred to in the main text. (B) Experiments 17–24 test the impact of acoustic shearing performed on a chromatin prep rather than on naked nucleic acid, as in the ChIP protocol. Each experiment was performed at a cell lysis temperature of 65°C. (C) Workflow of a ChIP experiment (shown only for comparison with the DRIP pipeline). (HCHO) Formaldehyde fixation, (Phe/Chl) phenol-chloroform extraction, (Kit) silica membrane-based nucleic acid purification, (RNase A) Ribonuclease A digestion performed at high (300 mM) NaCl concentration, (Son) sonication, (RE) restriction enzyme cocktail digestion (HindIII, EcoRI, BsrGI, XbaI, and SspI). As a negative control, RNase H digestion was applied in all DRIP experiments (not indicated in the cartoon).
Figure 2.
Summary of available human DRIP-seq experiments. (A) Bar chart showing the number of identified R-loop peaks in human Jurkat cells and naive T cells (this study). (B) Annotation of R-loop binding sites over functional genomic elements. DRIP-seq peaks were determined in Jurkat cells and naive T cells, and in other published cell types (NTERA2, K562, Fibroblast, MCF7, IMR90, HEK293T). The upper four rows represent DRIP experiments in which the nucleic acid was fragmented by sonication, while the lower five rows represent restriction enzyme-digested DRIP samples. The difference between the two groups is especially noticeable over exons (associated with 14%–27% and 1%–3.5% of R-loops, respectively) and repeat elements (SINEs, LINEs, LTRs, simple and low complexity repeats), which account for 22%–38% and 54%–67% of the R-loop peaks, respectively. In the other annotation categories (gene body, introns, and promoters), the difference between the two groups was not significant. (C) Density plots showing the distribution of R-loop peak sizes, classified by fragmentation method (restriction enzyme vs. sonication). Median peak length and 2.5%–97.5% quantiles are indicated. Peak length distributions differ significantly between the two fragmentation methods. (D) Heat map showing the overlap of R-loop binding sites between independent DRIP-seq experiments. Values and cell colors represent pairwise and unique overlap ratios between each peak set. The difference between the two nucleic acid fragmentation methods is clearly apparent, as peak sets from the same fragmentation process resemble each other more closely (highlighted in black).
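The pairwise overlap ratios described for panel D can be illustrated with a small sketch. This is not the authors' pipeline: the function name `overlap_ratio`, the `(chrom, start, end)` tuple layout, and the half-open coordinate convention are all assumptions made for illustration, and the caption does not state how an overlap was defined.

```python
from collections import defaultdict

def overlap_ratio(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates
    (an assumed convention, not taken from the source).
    """
    # Group the second peak set by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom[chrom]):
            hits += 1
    return hits / len(peaks_a) if peaks_a else 0.0

# Toy peak sets (made-up coordinates, for illustration only).
set_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
set_b = [("chr1", 150, 250), ("chr2", 30, 40)]
ratio_ab = overlap_ratio(set_a, set_b)
ratio_ba = overlap_ratio(set_b, set_a)
```

The ratio is asymmetric (here one of three peaks in `set_a` is covered, versus one of two in `set_b`), which is why a heat map of such values generally is not symmetric about its diagonal.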
Figure 3.
Good DRIP practice. (A) Bar charts showing the distribution of AUC (area under the curve) values of ROC plots for 24 DRIP classifiers. Error bars represent the confidence interval of AUCs. High (>0.7) AUC values were obtained for 10 DRIP classifiers (exp. 5, 6, 13, 15, 17, 18, 19, 21, and 24). Low (∼0.5) AUC values were obtained in four DRIP experiments (exp. 2, 10, 11, and 16). We highlight these groups as “preferred” and “not preferred,” respectively. (B,C) The top four DRIP experiments ranked by AUCs (exp. 5, 13, 17, and 19). (B) DRIP-qPCR enrichment scores are displayed over the test regions. Horizontal dotted lines represent the cutoff value (calculated from the ROC curves) separating the true R-loop signal from background. (C) ROC curves of the top four experiments. (D) Paired-ROC plots, comparing the main variables (steps) of the DRIP experiments. The level of statistical significance was 0.05.
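As a rough illustration of how an AUC value and a score cutoff can be derived from DRIP-qPCR enrichment scores over known positive and negative test regions, consider the sketch below. The caption says only that the cutoff was "calculated from the ROC curves"; maximizing Youden's J (TPR minus FPR) is an assumption chosen here for illustration, and the function names and toy data are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive region outscores a negative one
    (rank-based, equivalent to the Mann-Whitney U statistic; ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Score threshold maximizing TPR - FPR (Youden's J).

    Assumption: the source does not say which ROC-derived criterion was used.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        # Regions scoring at or above the cutoff are called R-loop positive.
        tpr = sum(s >= cut and y == 1 for s, y in zip(scores, labels)) / n_pos
        fpr = sum(s >= cut and y == 0 for s, y in zip(scores, labels)) / n_neg
        if tpr - fpr > best_j:
            best_j, best_cut = tpr - fpr, cut
    return best_cut

# Toy enrichment scores (made up): label 1 = known R-loop region, 0 = negative region.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(scores, labels)
cutoff = youden_cutoff(scores, labels)
```

An AUC near 0.5 means the scores rank positives above negatives no better than chance, which matches the caption's description of the "not preferred" experiments.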
Figure 4.
Analysis of restriction sites over genic and intergenic regions. (A) Restriction fragment lengths over genic regions (gene bodies, exons, first exons) are significantly larger than over intergenic regions. The plot shows the difference between genic (observed) and intergenic (expected) fragment sizes in base pairs. The following enzymes were applied in combination: HindIII, EcoRI, BsrGI, XbaI, and SspI. (B–D) The number of restriction sites over genic regions is significantly lower than over intergenic regions. Colors indicate the proportion of cutting sites in each category. Red and blue slices, marking the rarest restriction site frequencies, are prevalent over genic elements in each pie chart. (E) Cutting efficiency of restriction enzymes applied in the indicated DRIP-seq experiments. Zero reads: the restriction site was cut. One or more reads: the restriction site was uncut in a fraction of cells. Reads spanning uncut sites were found at over half of the theoretical restriction sites. The proportion of uncut sites was even higher within gene coding regions than in intergenic regions. See the model of cutting efficiency in panel F.
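The fragment-length comparison in panel A rests on locating the theoretical restriction sites of the five-enzyme cocktail. A minimal in silico digest might look like the following sketch. The recognition sequences and top-strand cut offsets are the standard ones for these enzymes and are not taken from the source; the function names and the toy sequence are hypothetical.

```python
import re

# Recognition site and top-strand cut offset for each enzyme in the cocktail.
# Standard values (e.g. EcoRI cuts G^AATTC); not stated in the source.
SITES = {
    "HindIII": ("AAGCTT", 1),
    "EcoRI": ("GAATTC", 1),
    "BsrGI": ("TGTACA", 1),
    "XbaI": ("TCTAGA", 1),
    "SspI": ("AATATT", 3),
}

def cut_positions(seq):
    """Sorted top-strand cut coordinates for all enzymes in SITES."""
    seq = seq.upper()
    cuts = set()
    for site, offset in SITES.values():
        # A lookahead pattern also reports overlapping occurrences.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)

def fragment_lengths(seq):
    """Lengths of the fragments produced by a complete digest of a linear sequence."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Toy sequence with one EcoRI site and one HindIII site.
toy = "CCCC" + "GAATTC" + "GGGG" + "AAGCTT" + "TT"
lengths = fragment_lengths(toy)
```

This assumes complete digestion; panel E indicates that many theoretical sites remained uncut in practice, so real fragments would tend to be longer than such a calculation suggests.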
Figure 5.
Large restriction fragments over gene bodies cause uncertainty in the precise localization of R-loops, potentially impeding their functional annotation. (A–C) Genome browser tracks showing three representative examples (MYC, BCL6, and VIM). Upper two tracks: restriction fragment-sized R-loops are prevalent over the 5′ end of genes, vastly exceeding the gene borders in the case of MYC. Lower two tracks: the precise genomic position of R-loops was resolved in the sonicated group of samples. Green boxes represent R-loop enriched regions predicted by the peak callers. Blue dashed lines represent cutting sites for restriction enzymes (HindIII, EcoRI, BsrGI, XbaI, and SspI).
