Mol Cell. 2022 Feb 3;82(3):645-659.e9.
doi: 10.1016/j.molcel.2021.12.023. Epub 2022 Jan 19.

Pseudouridine synthases modify human pre-mRNA co-transcriptionally and affect pre-mRNA processing

Nicole M Martinez et al. Mol Cell. 2022.

Abstract

Pseudouridine is a modified nucleotide that is prevalent in human mRNAs and is dynamically regulated. Here, we investigate when in their life cycle mRNAs become pseudouridylated to illuminate the potential regulatory functions of endogenous mRNA pseudouridylation. Using single-nucleotide resolution pseudouridine profiling on chromatin-associated RNA from human cells, we identified pseudouridines in nascent pre-mRNA at locations associated with alternatively spliced regions, enriched near splice sites, and overlapping hundreds of binding sites for RNA-binding proteins. In vitro splicing assays establish a direct effect of individual endogenous pre-mRNA pseudouridines on splicing efficiency. We validate hundreds of pre-mRNA sites as direct targets of distinct pseudouridine synthases and show that PUS1, PUS7, and RPUSD4, three pre-mRNA-modifying pseudouridine synthases with tissue-specific expression, control widespread changes in alternative pre-mRNA splicing and 3' end processing. Our results establish a vast potential for cotranscriptional pre-mRNA pseudouridylation to regulate human gene expression via alternative pre-mRNA processing.

Keywords: RNA modification; alternative cleavage and polyadenylation; alternative splicing; cotranscriptional; epitranscriptome; mRNA modification; pre-mRNA processing; pseudouridine; pseudouridine synthase.

Conflict of interest statement

Declaration of interests G.W.Y. is the cofounder of, a member of the Board of Directors of, on the scientific advisory board of, an equity holder in, and a paid consultant for Locanabio and Eclipse BioInnovations. G.W.Y. is a visiting professor at the National University of Singapore. G.W.Y.’s interests have been reviewed and approved by the University of California, San Diego, in accordance with its conflict of interest policies. The authors declare no other competing financial interests.

Figures

Figure 1. Pre-mRNA is pseudouridylated co-transcriptionally in human cells.
a) Left panel. Western blot of HepG2 cellular fractions; equal cell volumes were loaded and probed with antibodies against GAPDH (cytoplasm), U1-70K (nucleoplasm) and histone H3 (chromatin). Right panel. Distribution of pre-mRNA reads mapping to introns versus exons in the chromatin-associated RNA fraction. Bottom panel. Genome browser view of reads per million (RPM) mapping across the highly expressed gene hnRNPA2B1 in a chromatin-associated RNA library compared to a poly(A)+ mRNA library from HepG2 cells. b) Detection of pseudouridine by Pseudo-seq, with a representative genome browser view of Pseudo-seq reads mapping to RBM39; the red dotted line indicates the location of the pseudouridine (chr20:34297199) identified by a CMC-dependent reverse transcriptase stop one nucleotide 3′ to the site. c) Pseudo-seq signal, equal to the difference in normalized reads between the +CMC and mock libraries. Traces for 11 biological replicates of the chromatin-associated RPL7A pre-mRNA pseudouridine (chr9:136217792) are shown. d) ROC curve of true positive versus false positive rates of known pseudouridine locations in mature human rRNA for two representative chromatin-associated RNA replicates. The Pseudo-seq signal is displayed for both replicates. e) Representative genome browser view of Pseudo-seq reads mapping to HSPD1 from HepG2 chromatin-associated RNA and HeLa mRNA; the red dotted line indicates the location of this cell-type-conserved pseudouridine, identified by a CMC-dependent reverse transcriptase stop one nucleotide 3′ to the site. f) Top panel. Summary of pseudouridine sites identified in chromatin-associated RNAs. Bottom panel. Distribution of background uridines meeting the minimum read cutoff for site calling.
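The signal defined in panel c can be illustrated with a minimal sketch. It assumes that "normalized reads" means reads per million and that the input is a per-position count of reverse transcriptase stops; the function names and toy counts below are hypothetical and are not taken from the paper's pipeline.

```python
def reads_per_million(counts, library_size):
    """Scale raw per-position read counts to reads per million (RPM)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return [count * 1_000_000 / library_size for count in counts]


def pseudo_seq_signal(cmc_counts, mock_counts, cmc_library_size, mock_library_size):
    """Per-position difference in normalized reads: +CMC minus mock."""
    if len(cmc_counts) != len(mock_counts):
        raise ValueError("both libraries must cover the same positions")
    cmc_rpm = reads_per_million(cmc_counts, cmc_library_size)
    mock_rpm = reads_per_million(mock_counts, mock_library_size)
    return [c - m for c, m in zip(cmc_rpm, mock_rpm)]


def inferred_site(signal):
    """Return (stop_index, site_index) for the strongest CMC-dependent stop.

    The stop falls one nucleotide 3' of the modified base, so the candidate
    pseudouridine is one position 5' of the peak (index - 1 in 5'->3' order).
    """
    stop_index = max(range(len(signal)), key=lambda i: signal[i])
    return stop_index, stop_index - 1


# Toy counts of reverse transcriptase stops along a short 5'->3' window
# (hypothetical values, equal library sizes for simplicity).
cmc = [2, 3, 40, 1]
mock = [2, 3, 4, 1]
signal = pseudo_seq_signal(cmc, mock, 1_000_000, 1_000_000)
stop, site = inferred_site(signal)
```

In this toy window the only CMC-dependent stop is at index 2, which places the candidate pseudouridine at index 1.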
Figure 2. Pseudouridines are enriched around splicing regulatory features and directly affect splicing.
a) Schematic of a spliced exon including the core signals for splice site recognition: branch point region (BP), polypyrimidine tract (PPT) and 5′ and 3′ splice sites (ss). The number of pseudouridines identified in each splice site region is summarized below the schematic. Pseudouridines are enriched in proximal introns (within 500 nt of splice sites), p-value = 1.3e−05 from Fisher’s exact test, and within splice sites (within 6 nt of intron ends), p-value = 0.55e−06 from Fisher’s exact test. b) Distribution of pseudouridines (filled bars) versus uridines with adequate read coverage in the introns (line bars) of annotated alternatively spliced regions. P-value = 2.2e−16 denotes a significant change in the distribution of pseudouridine relative to the uridine distribution, as determined by a chi-squared test for the overall change in proportions across regions. Alternative splice sites (Alt SS). c) Representative RT-PCR gel of in vitro splicing of the RBM39 two-exon reporter (Supplemental Figure 2a,c) that was either unmodified or site-specifically modified with pseudouridine (−/+ Ψ) in splicing-competent wild-type Jurkat nuclear extract. d) Quantification of % spliced for in vitro splicing of the RBM39 reporter in a splicing time course (30, 60 and 90 minutes) in Jurkat nuclear extract. Data are displayed as a stripchart with box plot, where the dots represent the value for each sample for a given condition (30 (n=2), 60 (n=3) and 90 minutes (n=3)). P-values were calculated by a paired t-test, and a difference was considered significant if p-value < 0.05. An asterisk denotes significance. e) Quantification of in vitro splicing of the RBM39 reporter that was either unmodified or site-specifically modified with pseudouridine (−/+ Ψ) in Jurkat and HeLa nuclear extract. Quantification of % spliced from n=3 is displayed as a stripchart with box plot, where the dots represent the value for each sample for a given cell type. P-values were calculated by a paired t-test, and a difference was considered significant if p-value < 0.05. An asterisk denotes significance.
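The enrichment tests in panel a compare pseudouridines with background uridines inside and outside a region. The sketch below is a plain one-sided Fisher's exact test built from the hypergeometric distribution. The caption does not say whether the test was one- or two-sided and does not give the underlying counts, so the sidedness, the table layout, and the example numbers are assumptions for illustration only.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test (enrichment) for the 2x2 table

                    in region   outside region
    pseudouridine       a             b
    background U        c             d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    total = a + b + c + d
    row1 = a + b   # all pseudouridines
    col1 = a + c   # all uridines in the region
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


# Small textbook-style table (hypothetical counts, not from the paper).
p_value = fisher_exact_greater(3, 1, 1, 3)
```

For this small table the tail sums to 17/70, about 0.24. Real site counts would simply replace the four cell values.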
Figure 3. Pseudouridines are enriched in RNA binding protein binding sites.
a, b) Genome browser views of eCLIP peaks and size-matched input (SMI) controls for U2AF2 on IDH1 (a) and SF3A3 on ITIH3 (b). The location of pseudouridine relative to the eCLIP peak is denoted by (Ψ). c) Volcano plots of pseudouridines overlapping RBP eCLIP peaks, displaying the fold enrichment (IP over size-matched input) versus the SMI-normalized adjusted p-value (Van Nostrand et al. 2016 Nat Methods). The overlap between eCLIP peaks and pseudouridines is shown for two RBPs, U2AF2 and SF3A3. Proximal introns refer to intronic sequences <500 nt from splice sites and distal introns refer to intronic sequences >500 nt from splice sites. d) Z-scores were generated by comparing the fraction of eCLIP peaks overlapping pseudouridines to the calculated overlap after shuffling pseudouridines within intronic regions 1000 times. e) The Z-score of pseudouridines shuffled 1000 times within intronic regions plotted against the Z-score of uridines shuffled 1000 times within intronic regions.
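The shuffling procedure behind the Z-scores in panel d can be sketched as a simple permutation test. This sketch assumes half-open integer intervals on a single sequence and that each pseudouridine is relocated uniformly within its own intron. The function names and toy coordinates are hypothetical, and the authors' implementation may differ in how sites are redistributed.

```python
import random
import statistics


def fraction_of_peaks_hit(peaks, sites):
    """Fraction of peaks (start, end) that contain at least one site."""
    hit = sum(1 for start, end in peaks if any(start <= s < end for s in sites))
    return hit / len(peaks)


def shuffle_within_introns(sites, introns, rng):
    """Move each site to a uniformly random position in the intron holding it."""
    shuffled = []
    for s in sites:
        # Assumes every site lies inside exactly one listed intron.
        start, end = next((a, b) for a, b in introns if a <= s < b)
        shuffled.append(rng.randrange(start, end))
    return shuffled


def overlap_z_score(peaks, sites, introns, n_shuffles=1000, seed=0):
    """Compare observed peak overlap with a null built from shuffled sites."""
    rng = random.Random(seed)
    observed = fraction_of_peaks_hit(peaks, sites)
    null = [
        fraction_of_peaks_hit(peaks, shuffle_within_introns(sites, introns, rng))
        for _ in range(n_shuffles)
    ]
    mean = statistics.mean(null)
    sd = statistics.stdev(null)
    if sd == 0:
        raise ValueError("null distribution has zero variance")
    return observed, (observed - mean) / sd


# Toy example: one intron, three peaks, one site inside each peak.
introns = [(0, 1000)]
peaks = [(100, 150), (500, 550), (900, 950)]
sites = [120, 510, 930]
observed, z = overlap_z_score(peaks, sites, introns)
```

Because every toy site sits inside a peak, the observed fraction is 1.0 and the Z-score is strongly positive. Applying the same routine to background uridines would give the comparison plotted in panel e.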
Figure 4. Multiple PUS pseudouridylate pre-mRNA sequences.
a) Schematic of the in vitro pseudouridylation assay with RNA made from a pool of 6000 oligos containing all the sites identified in HepG2 chromatin-associated RNA. In vitro pseudouridylation was carried out by incubating pool RNA with recombinant human pseudouridine synthases (PUS), and pseudouridines were identified by Pseudo-seq. b) Genome browser view of Pseudo-seq reads at pseudouridine sites following RNA incubation with a recombinant PUS or a no-PUS control. Plots for three intronic pre-mRNA pseudouridines: a PUS1 target in GPC3, a PUS7 target in RBM39 and a TRUB1 target in NOMO2. c) Combined distribution of pseudouridines validated as direct targets of all tested PUS by the in vitro Pseudo-seq assay. d) Summary of pseudouridines assigned as direct targets of each PUS protein from the in vitro Pseudo-seq assay. e) Weblogo summarizing the frequency of motifs identified among targets of PUS7 and TRUB1.
Figure 5. Pseudouridine synthases regulate alternative splicing.
a) Left - Western blot of the CRISPR knockout PUS1 HepG2 cell line probed for PUS1 and a loading control. RNA was isolated from PUS1 knockout (KO) and wild type (WT) cells and mRNA-seq libraries were prepared from poly(A)+ mRNA. The number of significant alternative splicing changes in PUS1 KO versus WT (n=2 biological replicates) is displayed by type of alternative splicing: cassette exons (cassette), alternative 3′ splice sites (alt 3′ ss), alternative 5′ splice sites (alt 5′ ss), retained introns (RI) and mutually exclusive exons (ME). Significant alternative splicing events were determined from rMATS as those events that changed by more than 10% in percent inclusion with a false discovery rate (FDR) of less than or equal to 0.05. Middle - Western blot of representative RPUSD4 knockdown (~60%) at 96h following shRNA induction. RPUSD4-sensitive alternative splicing changes were determined from RNA-seq analysis (n=2 biological replicates) as above. Right - Western blot of representative PUS7 knockdown (~90%) at 96h following shRNA induction. PUS7-sensitive alternative splicing changes were determined from RNA-seq analysis (n=3 biological replicates) as above. b) Left - schematic of a cassette exon in PUM2 and location of pseudouridine. Right - quantification of exon inclusion in WT and PUS1 KO based on junction-spanning reads from RNA-seq. Asterisk denotes statistical significance based on p-value < 0.05 as calculated by rMATS. c) Genome browser view of Pseudo-seq reads of the intronic PUM2 pseudouridine site (Figure 5b) following pseudouridylation with recombinant PUS1 or in the absence of PUS. d) Left - schematic of a cassette exon in NAP1L4 and location of pseudouridine. Right - quantification of exon inclusion in WT and PUS7 KD based on junction-spanning reads from RNA-seq. Asterisk denotes statistical significance based on p-value < 0.05 as calculated by rMATS. e) Genome browser view of Pseudo-seq reads of the intronic NAP1L4 pseudouridine site (Figure 5d) following pseudouridylation with recombinant PUS7 or in the absence of PUS. f) Scatter plot showing pairwise comparisons of Z-score values at candidate pseudouridine sites incubated with recombinant PUS1 versus PUS7. g) Venn diagram of overlap among cassette exons regulated by PUS1, PUS7 and RPUSD4.
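The significance cutoffs stated for panel a (a change in percent inclusion greater than 10% and FDR of at most 0.05) amount to a simple filter over per-event results. The sketch below uses a hypothetical record type and made-up events. It does not parse real rMATS output, and the field names are illustrative rather than rMATS column names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SplicingEvent:
    gene: str          # placeholder gene label
    event_type: str    # e.g. "cassette", "RI", "ME"
    delta_psi: float   # change in percent inclusion, as a fraction (0.12 = 12%)
    fdr: float         # false discovery rate for the event


def is_significant(event, min_delta_psi=0.10, max_fdr=0.05):
    """Apply the caption's cutoffs: |delta PSI| > 10% and FDR <= 0.05."""
    return abs(event.delta_psi) > min_delta_psi and event.fdr <= max_fdr


def count_significant_by_type(events):
    """Tally significant events per alternative splicing category."""
    return Counter(e.event_type for e in events if is_significant(e))


# Hypothetical events for illustration only.
events = [
    SplicingEvent("GENE_A", "cassette", 0.12, 0.01),    # passes both cutoffs
    SplicingEvent("GENE_B", "cassette", -0.25, 0.05),   # passes (FDR at the limit)
    SplicingEvent("GENE_C", "RI", 0.10, 0.001),         # fails: change not > 10%
    SplicingEvent("GENE_D", "alt 3' ss", 0.30, 0.20),   # fails: FDR too high
]
counts = count_significant_by_type(events)
```

Taking the absolute value treats increased and decreased inclusion alike, which matches the caption's wording of a "difference" in percent inclusion.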
Figure 6. Pseudouridine synthases regulate 3′ end processing.
a) Genome browser view of a PUS1-dependent alternative cleavage and polyadenylation (APA) event in the 3′ UTR of SRSF6. Upon PUS1 KO there is a shift toward usage of a proximal polyA site (PAS1) and away from the distal polyA site (PAS2), resulting in expression of a shorter 3′ UTR isoform. The locations of two pseudouridines in this 3′ UTR, one upstream of each PAS, are indicated. b) The number of significant alternative cleavage and polyadenylation events in PUS1 KO (n=2 biological replicates), RPUSD4 KD (n=2 biological replicates) or PUS7 KD (n=3 biological replicates) compared to WT is displayed by type of APA: 3′ UTR shortening and 3′ UTR lengthening. Significant APA events were determined from QAPA as those events that changed by more than 10% in polyA site usage and were reproducible across replicates. c) Genome browser views of the CSTF2T eCLIP peak and size-matched input (SMI) controls in the 3′ UTR of HMGS1. The location of a pseudouridine relative to the eCLIP peak is denoted by (Ψ).

Comment in

  • Pseudouridylation alters splicing.
    Zlotorynski E. Nat Rev Mol Cell Biol. 2022 Mar;23(3):167. doi: 10.1038/s41580-022-00458-x. PMID: 35087241.
