BMC Genomics. 2016 May 5;17:338.
doi: 10.1186/s12864-016-2675-5.

Improvements to the HITS-CLIP protocol eliminate widespread mispriming artifacts

Austin E Gillen et al. BMC Genomics.

Abstract

Background: High-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP) allows for high-resolution, genome-wide mapping of RNA-binding proteins. This methodology is frequently used to validate predicted targets of microRNA binding, as well as direct targets of other RNA-binding proteins. Hence, the accuracy and sensitivity of binding site identification are critical.

Results: We found that substantial mispriming during reverse transcription results in the overrepresentation of sequences complementary to the primer used for reverse transcription. Up to 45 % of peaks in publicly available HITS-CLIP libraries are attributable to this mispriming artifact, and the majority of libraries have detectable levels of mispriming. We also found that standard techniques for validating microRNA-target interactions fail to differentiate between artifactual peaks and physiologically relevant peaks.

Conclusions: Here, we present a modification to the HITS-CLIP protocol that effectively eliminates this artifact and improves the sensitivity and complexity of resulting libraries.

Keywords: CLIP-seq; HITS-CLIP; PAR-CLIP; iCLIP; microRNA.

Figures

Fig. 1
Mispriming on genomic occurrences of the 3′ adaptor sequence produces an artifact in HITS-CLIP data. a Occurrences of the first six bases of the 3′ adaptor (allowing for one mismatch) in 200 bp windows around peak centers plotted using 20 bp sliding windows (with a 6 bp shift between each window) relative to the expected frequency of each adaptor-complement (calculated using 1 × 10^6 randomly sampled exonic sequences of 200 bp). Our early samples (emerald; 6 samples from 1 research group) show consistent overrepresentation of the adaptor sequence. This overrepresentation is also seen in a group of published samples (blue; 25 samples from 10 groups), while another group of published samples shows underrepresentation of the adaptor sequence at the center of the peak (vermilion; 19 samples from 9 groups). The samples with the most extreme over- and underrepresentation are shown as dashed blue and vermilion lines, respectively. b Percentage of peaks containing the first six bases of the 3′ adaptor sequence (allowing for one mismatch) between positions −25 and +75 in each peak (highlighted in grey in a), minus the expected frequency (calculated using 1 × 10^6 randomly sampled exonic sequences of 200 bp). Groups are the same as in (a).
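The sliding-window analysis described in the Fig. 1 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the motif passed in is whatever six-base adaptor prefix the user supplies, and "relative to the expected frequency" is assumed here to mean a simple ratio of observed to background hit fractions.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def has_motif(seq, motif, max_mismatches=1):
    """True if any substring of seq matches motif within max_mismatches."""
    k = len(motif)
    return any(
        hamming(seq[i:i + k], motif) <= max_mismatches
        for i in range(len(seq) - k + 1)
    )


def window_profile(seqs, motif, window=20, step=6, max_mismatches=1):
    """Fraction of sequences with a motif hit in each sliding window.

    All sequences are assumed to have the same length (200 bp in the caption).
    """
    if not seqs:
        raise ValueError("seqs must not be empty")
    length = len(seqs[0])
    profile = []
    for start in range(0, length - window + 1, step):
        hits = sum(
            has_motif(s[start:start + window], motif, max_mismatches)
            for s in seqs
        )
        profile.append(hits / len(seqs))
    return profile


def enrichment(peak_seqs, background_seqs, motif, **kwargs):
    """Observed / expected hit fraction per window (assumed to be a ratio)."""
    observed = window_profile(peak_seqs, motif, **kwargs)
    expected = window_profile(background_seqs, motif, **kwargs)
    return [o / e if e > 0 else float("nan") for o, e in zip(observed, expected)]
```

Here `peak_seqs` would hold the 200 bp sequences centered on each peak and `background_seqs` the randomly sampled exonic sequences; values well above 1 near the peak center would correspond to the overrepresentation shown in panel a.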
Fig. 2
Identification of misprimed reads using a nested reverse transcription primer. a When the reverse transcription primer (blue) is flush with the 3′ adaptor, reads originating from 3′ adaptor (emerald) priming and mispriming (vermilion) are indistinguishable, as both are competent PCR templates. b However, when a nested reverse transcription primer (blue) is used along with a protected reverse PCR primer (bases with phosphorothioate bonds are shown in sky blue), only reads originating from 3′ adaptor (emerald) priming are valid templates for PCR amplification. Reverse transcription products derived from mispriming are not amplified in subsequent PCR steps, as they do not contain the final 3 bases of the reverse PCR primer. This 3-base mismatch prevents elongation of the primer by the polymerase, and the phosphorothioate bonds prevent the ‘chew-back’ of the primer by exonuclease activity of the polymerase
Fig. 3
Removal of reads derived from mispriming events eliminates artifactual peaks. Occurrences of the first six bases of the 3′ adaptor (allowing for one mismatch) (a) or bases 4–9 of the 3′ adaptor (b) in 200 bp windows around peak centers plotted using 20 bp sliding windows (with a 6 bp shift between each window) relative to the expected frequency of each adaptor-complement (calculated using 1 × 10^6 randomly sampled exonic sequences of 200 bp). When using the original HITS-CLIP protocol (emerald), significant mispriming is observed. The use of a nested RT primer reduces the overrepresentation of the adaptor sequence (vermilion, a), but results in overrepresentation of the sequence complementary to the 3′ end of the RT primer (vermilion, b). Finally, using a nested RT primer and protected PCR primer results in much more even representation of both the adaptor sequence (blue, a) and RT primer complement (blue, b), with only modest overrepresentation of each sequence around the center of the peak.
Fig. 4
True peaks are indistinguishable from artifacts in most HITS-CLIP data. a A CLIP peak in the Progesterone Receptor (PGR) 3′ UTR (emerald; original protocol) in MCF-7 cells is the result of mispriming on a perfect reverse-complementary match to the final 8 bases of the RT primer (highlighted in vermilion). This sequence is also complementary to miR-888, making the resulting peak appear to contain a miR-888 target. This peak disappears completely when the reads are filtered (sky blue) or a nested RT primer and protected PCR primer are used (blue; 0, 6 and 24 h after stimulation with estradiol). b A robust CLIP peak in the c-Myc 3′ UTR is a bona fide target of miR-34b (seed complement highlighted in emerald) [–35]. This peak is also present when the data are filtered (sky blue) and when a nested RT primer and protected PCR primer are used (blue). Figure based on output from http://genome.ucsc.edu (hg19 assembly) [42, 43].
Fig. 5
miR-888 overexpression represses Progesterone Receptor expression. a Luciferase reporter assays confirm functional binding of overexpressed miR-888-3p to PGR-MRE and disruption of this interaction upon mutation of the seed-binding region. b Western blot analysis of BT474 cells transfected with miR-888 or control (5 nM each). MiR-888-3p overexpression represses PGR expression as compared to cells transfected with control mimic. c Progesterone treatment increases the number of CK5-expressing cells in ER+ breast cancer cells (T47D are shown). MiR-888-3p overexpression effectively blocks the progesterone-induced increase of CK5.
Fig. 6
A nested RT primer and protected reverse PCR primer produce high-confidence AGO HITS-CLIP peaks and increase sensitivity of interaction detection. Percentage of peaks containing the reverse complement of the 6 bp miRNA seed site for at least one of the 10 most highly expressed miRNAs in that sample (measured in the same CLIP experiment). The use of a nested RT primer and protected PCR primer (blue; n = 3) results in a significant, 30 % increase in the percentage of peaks with a top-10 miRNA seed site relative to the original protocol (emerald; n = 6), while the use of a nested RT primer alone (vermilion; n = 3) produces a 16 % increase. *p ≤ 0.05
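The seed-site metric in the Fig. 6 caption can be illustrated with a short sketch. This is an assumption-laden example rather than the authors' pipeline: the function names are hypothetical, the 6 bp seed is assumed to be miRNA positions 2 to 7, and peak sequences are assumed to be DNA-alphabet strings on the transcript strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def seed_site(mirna, start=1, length=6):
    """Target-site sequence for a miRNA seed.

    Assumes the seed is positions 2-7 of the miRNA (0-based start=1),
    which is a common convention but not stated in the caption.
    """
    seed = mirna.upper().replace("U", "T")[start:start + length]
    return reverse_complement(seed)


def percent_peaks_with_seed(peak_seqs, mirnas):
    """Percentage of peaks containing a seed site for at least one miRNA."""
    if not peak_seqs:
        raise ValueError("peak_seqs must not be empty")
    sites = {seed_site(m) for m in mirnas}
    hits = sum(
        any(site in seq.upper() for site in sites) for seq in peak_seqs
    )
    return 100.0 * hits / len(peak_seqs)
```

In practice `mirnas` would be the 10 most highly expressed miRNA sequences from the same CLIP experiment, and the resulting percentage would be compared between protocols.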
