Mol Cell. 2011 Dec 9;44(5):828-40.
doi: 10.1016/j.molcel.2011.11.009.

In vivo and transcriptome-wide identification of RNA binding protein target sites

Anna-Carina Jungkamp et al. Mol Cell. 2011.

Abstract

Animal mRNAs are regulated by hundreds of RNA binding proteins (RBPs). The identification of RBP targets is crucial for understanding their function. A recent method, PAR-CLIP, uses photoreactive nucleosides to crosslink RBPs to target RNAs in cells prior to immunoprecipitation. Here, we establish iPAR-CLIP (in vivo PAR-CLIP) to determine, at nucleotide resolution, transcriptome-wide binding sites of GLD-1, a conserved, germline-specific translational repressor in C. elegans. We identified 439 reproducible target mRNAs and demonstrate an excellent dynamic range of target detection by iPAR-CLIP. Upon GLD-1 knockdown, protein but not mRNA expression of the 439 targets was specifically upregulated, demonstrating functionality. Finally, we discovered strongly conserved GLD-1 binding sites near the start codon of target genes. These sites are functional in vitro and likely confer strong repression in vivo. We propose that GLD-1 interacts with the translation machinery near the start codon, a so-far-unknown mode of gene regulation in eukaryotes.

Figures

Figure 1. In vivo PAR-CLIP
(a) iPAR-CLIP methodology. Photoreactive nucleosides (4-thiouridine, 4SU) were added to C. elegans L1 larvae in liquid culture. Alternatively, 6-thioguanosine (6SG) can be used. Adult worms expressing a GLD-1::GFP::FLAG fusion protein in the germline (marked in green) were irradiated with UV light (365 nm). After immunoprecipitation of GLD-1, crosslinked RNA was partially digested and radiolabeled. Protein-RNA complexes were size-separated on a denaturing gel. After gel elution, proteins were digested with Proteinase K and RNA was converted into a cDNA library for next-generation sequencing. Binding sites of RNA binding proteins can be identified by a high incidence of conversions (T to C when using 4SU) in the sequence reads. (b) 4SU incorporation rates after labeling of worms with different concentrations of 4SU were measured by LC-MS/MS. For comparison, incorporation of 4SU in HEK293 cells labeled as described in (Hafner et al., 2010) is shown. Error bars due to technical variability range from the lowest to the highest incorporation rates measured in one sample. (c) PCRs showing labeling of transcripts specifically expressed in muscle (myo-2, myo-3), germline (oma-1, pie-1), intestine (elt-2), neurons (unc-8) or in a few cells including the distal tip cell (lag-2). Labeled: total RNA of 4SU-labeled worms was biotinylated and isolated using streptavidin beads. Unlabeled: same procedure using worms not labeled with 4SU. (d) Adult worms labeled with different concentrations of 4SU were irradiated with different doses of 365 nm UV light. Dead and surviving worms were counted. (e) Phosphorimage of SDS gel resolving 5′-32P-labeled RNA crosslinked to GLD-1::GFP::FLAG immunoprecipitates. IPs were prepared from lysates of worms grown in the presence or absence of 4SU and crosslinked with 365 nm UV light. For comparison, samples prepared from non-irradiated worms were included. Lower panel: immunoblot with anti-GFP antibody (loading control).
Figure 2. Identification of GLD-1 target sites by iPAR-CLIP
(a) Sequence reads obtained by iPAR-CLIP were mapped to C. elegans mRNAs and, after removing redundant reads, organized into sequence read clusters (GLD-1 binding motif highlighted in green, T to C conversions in red, T deletions in orange). (b) Length distribution of the identified sequence read clusters. In the sequence read clusters, T to C mutations (c) and T deletions (d) were ~10-fold enriched over other types of mutations or deletions. (e) GLD-1 binding motif (p value < 10⁻²⁵⁰) identified in the top 100 clusters using MEME.
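The enrichment of T to C conversions described in panels (c) and (d) rests on tallying mismatch types between mapped reads and the reference. A minimal sketch of such a tally follows; it assumes gap-free alignments given as (offset, sequence) pairs, and the function name and toy sequences are invented for illustration and are not taken from the paper's pipeline.

```python
from collections import Counter


def count_mismatches(reference, reads):
    """Tally mismatch types (e.g. "T>C") for gap-free aligned reads.

    reference: reference sequence of the cluster region (hypothetical input)
    reads: list of (offset, sequence) pairs, offset being the 0-based start
           of the read on the reference
    """
    counts = Counter()
    for offset, seq in reads:
        for i, base in enumerate(seq):
            ref_base = reference[offset + i]
            if base != ref_base:
                counts[f"{ref_base}>{base}"] += 1
    return counts


# Toy data, invented for illustration only.
reference = "ACTTACTAACGT"
reads = [(0, "ACTCACTAAC"), (2, "TCACTAACGT"), (1, "CTTACCAACG")]
counts = count_mismatches(reference, reads)
```

In this toy input every mismatch is a T to C conversion; on real data the same tally would be compared across all mismatch types to estimate the enrichment. Deletions would need gapped alignments, which this sketch does not handle.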
Figure 3. iPAR-CLIP is highly reproducible
(a) 75% of the target genes and 66% of the nucleotides identified in iPAR-CLIP sequence read clusters could be reproduced in a biological replicate. Technical reproducibility at the gene and nucleotide level is at least 86% and 69%, respectively. (b) Three 4SU iPAR-CLIPs and one 6SG iPAR-CLIP overlap in 439 targets, including 16 of 18 known GLD-1 targets. (c) For the 439 targets, 87% of the identified clusters were mapped to 3′UTRs, 7% to 5′UTRs. (d) Reproduced target sites (identified in all four iPAR-CLIP experiments), but not non-reproduced sites (identified in one iPAR-CLIP), are highly likely to be conserved (see Supplemental Methods). Error bars represent the standard deviation computed from binomial statistics and reflect the uncertainty due to the limited number of data points. (e) Reproduced, but not non-reproduced, target sites are highly likely to contain strong GLD-1 binding motifs (see Supplemental Methods). (f) Number of T conversions in known GLD-1 targets in two biological replicates, normalized by the number of sequence reads. (g) Each cluster was tested for the presence of a GLD-1 binding motif and assigned a score. Plotted is the probability of obtaining an equally high or higher score in a random cluster of the same length and similar dinucleotide composition for the 630 identified clusters. 39% of the 630 identified clusters can be explained singly by the presence of GLD-1 motifs over background (dashed line).
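For readers unfamiliar with error bars "computed from binomial statistics", the usual form is the standard deviation of an observed fraction, sqrt(p(1 - p)/n). The sketch below assumes that this is the formula meant in panel (d); the exact procedure is in the Supplemental Methods and is not reproduced here. The function name and the counts are hypothetical.

```python
import math


def binomial_fraction_sd(successes, total):
    """Standard deviation of an observed fraction under a binomial model."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    return math.sqrt(p * (1 - p) / total)


# Hypothetical counts: 50 conserved sites out of 100 sites in a bin.
sd = binomial_fraction_sd(50, 100)
```

With few data points per bin the denominator is small and the error bar widens, which is the uncertainty the caption refers to.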
Figure 4. GLD-1 targets are expressed in the germline and distributed over a wide range of expression levels
(a) mRNAs from wildtype worms and mutant worms that lack a germline were sequenced (RNA-seq). RPKM values reflect mRNA abundances (see Methods). The vast majority of the 439 reproducible GLD-1 targets (red) are expressed in the germline. GLD-1 target genes that are supported by at least one iPAR-CLIP replicate but not by all four (black) have less distinct germline expression. (b) RPKMs from mutant worms (glp-4) were weighted to account for the different proportions of somatic cells in the different samples and subtracted from wildtype RPKMs (Supplemental Methods). The resulting RPKMs quantify, if positive and significant (p value < 0.01), expression levels in the germline (Supplemental Methods). Upper panel: histogram of germline expression levels (log units) for all genes with confident fold changes. Lower panel: for the 313 iPAR-CLIP targets for which we can confidently quantify germline expression, the detection range spans four orders of magnitude in expression levels and is only slightly shifted towards higher RPKM values compared to background expression. Red line: median. (c) Soma expression levels were computed analogously (right panels). Only 14 out of 439 iPAR-CLIP targets were confidently quantified as exclusively expressed in somatic tissues and not expressed in the germline at detectable levels.
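The subtraction described in panel (b) can be sketched as follows. RPKM is computed with its standard definition (reads per kilobase of transcript per million mapped reads); the weight applied to the mutant sample, the function names, and all numbers are assumptions for illustration. The paper's actual weighting and significance test are described in its Supplemental Methods and are not reproduced here.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def germline_estimate(wildtype_rpkm, mutant_rpkm, soma_weight):
    """Wildtype RPKM minus the weighted RPKM of the germline-less mutant.

    soma_weight is a hypothetical scaling factor standing in for the
    correction for different proportions of somatic cells.
    """
    return wildtype_rpkm - soma_weight * mutant_rpkm


# Hypothetical gene: 2,000 bp transcript, 10 million mapped reads per sample.
wt = rpkm(500, 2000, 10_000_000)       # wildtype sample
mut = rpkm(100, 2000, 10_000_000)      # germline-less mutant sample
germline = germline_estimate(wt, mut, soma_weight=0.5)
```

A positive result suggests germline expression; the caption additionally requires significance (p value < 0.01), which this sketch does not test.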
Figure 5. Identified iPAR-CLIP targets are functional in vivo
Panels (a), (c), and (d) show cumulative fractions of fold changes in protein expression after GLD-1 knockdown. Protein fold changes for 3,874 genes in total were measured using SILAC in C. elegans. (a) Of the 439 reproducible iPAR-CLIP targets, protein fold changes were obtained for 202 germline-expressed targets. Upon GLD-1 knockdown, reproducible iPAR-CLIP targets show a highly significant shift (p value < 9 × 10⁻⁵) towards higher protein expression levels compared to all genes expressed in the germline. Changes in protein abundance averaged over targets that contain 3′UTR sites, targets that contain 5′UTR sites, and targets that contain both 3′UTR and 5′UTR sites are shown. (b) RT-qPCRs for 14 targets that change at the protein level upon GLD-1 knockdown (same samples as in (a)). RT-qPCRs for gld-1 itself and 2 negative controls (ama-1, act-1) were included. (c) In contrast to the reproducible set of 439 iPAR-CLIP targets, targets identified in only one of four iPAR-CLIP experiments are not de-repressed upon GLD-1 knockdown. (d) Changes in protein abundance upon GLD-1 knockdown averaged over all 2,127 putative targets identified in one iPAR-CLIP replicate.
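A cumulative fraction plot of this kind reports, for each fold-change value, the share of genes at or below it; a curve shifted to the right indicates higher protein levels after knockdown. The sketch below shows one way to compute such a fraction. The function name and the log2 fold changes are hypothetical and are not data from the study.

```python
from bisect import bisect_right


def cumulative_fraction(values, threshold):
    """Fraction of values less than or equal to threshold."""
    ordered = sorted(values)
    return bisect_right(ordered, threshold) / len(ordered)


# Hypothetical log2 protein fold changes after knockdown.
target_changes = [0.1, 0.4, 0.6, 0.9]
background_changes = [-0.3, -0.1, 0.0, 0.2]

targets_at_zero = cumulative_fraction(target_changes, 0.0)
background_at_zero = cumulative_fraction(background_changes, 0.0)
```

In this toy example none of the targets but three quarters of the background genes have a fold change at or below zero, which is what a right-shifted target curve looks like at a single threshold.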
Figure 6. Identified iPAR-CLIP binding sites are functional in vivo
(a,d) Reporter constructs containing a gld-1 promoter fused to GFP::Histone 2B and lin-28 or cpg-2 3′UTRs with unaltered or mutated versions of the GLD-1 binding motif were introduced into C. elegans. (b,c,e,f) Mutation of the GLD-1 binding motif leads to de-repression of reporter constructs in the meiotic region of the germline where GLD-1 is expressed.
Figure 7. GLD-1 binds close to the start codon
(a) Examples of 5′UTR binding sites with highly conserved GLD-1 binding motifs directly upstream of start codons. The number of T conversions is color-coded. Nucleotides surrounding binding sites are not conserved. (b) Gel shift assays demonstrating binding of the GLD-1 RNA binding domain (RBD) to 5′UTR target sites in vitro, depending on the GLD-1 binding motif. For direct titration experiments, radiolabeled RNA was incubated with increasing concentrations of GLD-1. For competition binding assays, unlabeled competitor RNA was titrated into labeled GLD-1-RNA complexes (Supplemental Methods). Mutant competitor RNAs did not outcompete wildtype RNAs in the tested range of concentrations, indicating that their affinity is at least one order of magnitude lower than that of wildtype sequences. (c–e) Models for GLD-1 dimer-dependent translational regulation via the 5′UTR (see Discussion).
