Mol Syst Biol. 2016 Jul 18;12(7):875.
doi: 10.15252/msb.20166941.

Pervasive isoform-specific translational regulation via alternative transcription start sites in mammals

Xi Wang et al. Mol Syst Biol.

Abstract

Transcription initiated at alternative sites can produce mRNA isoforms with different 5'UTRs, which are potentially subject to differential translational regulation. However, the prevalence of such isoform-specific translational control across mammalian genomes is currently unknown. By combining polysome profiling with high-throughput mRNA 5' end sequencing, we directly measured the translational status of mRNA isoforms with distinct start sites. Among 9,951 genes expressed in mouse fibroblasts, we identified 4,153 with significant initiation at multiple sites, of which 745 exhibited significant isoform-divergent translation. Systematic analyses of the isoform-specific translation revealed that isoforms with longer 5'UTRs tended to be translated less efficiently. Further investigation of cis-elements within 5'UTRs not only provided novel insights into the regulation by known sequence features, but also led to the discovery of novel regulatory sequence motifs. Quantitative models integrating all these features explained over half of the variance in the observed isoform-divergent translation. Overall, our study demonstrated extensive translational regulation through the usage of alternative transcription start sites and offered a comprehensive understanding of translational regulation by diverse sequence features embedded in 5'UTRs.

Keywords: alternative transcription start sites; cis‐regulatory elements; isoform‐divergent translation; translational regulation.

Figures

Figure 1. Experimental scheme, TSS discovery, and examples of isoform‐specific translational efficiency (TE)
  A. Experimental scheme. RNAs were collected from seven gradient fractions, and the 5ʹ ends of RNA transcripts were quantitatively profiled in each fraction using an adapted cap‐trapping approach.

  B. Pie chart showing the distribution of TSSs identified in this study across different regions of protein‐coding genes. The majority of TSSs were derived from gross 5ʹUTRs, which include annotated 5ʹUTRs and the 1 kb upstream of the annotated TSSs (Up‐1 kb).

  C. Pie chart showing the number of TSSs in the gross 5ʹUTRs per protein‐coding gene. Out of the 9,951 genes with at least one TSS detected, 4,153 (41.7%) expressed multiple TSSs.

  D. Two examples demonstrating the impact of alternative TSSs on TE. Cumulative reads along each gene from the seven gradient fractions (shown in the middle) are plotted under the gene structure. While the two alternative TSSs of gene Nfkb2 resulted in no difference in TE, the two of gene Cnot1 led to a substantial TE difference. Note that the range of read coverage varies across fractions. Red and blue bars represent sequencing reads mapped within distal and proximal TSSs, respectively; gray bars represent reads mapped outside of the identified TSSs. The description of the two genes can be found in Table EV3.

Figure EV1. TSS discovery and quantification across different gradient fractions detected in this study
  A. Scatter plot comparing read counts of each TSS cluster between biological replicates for each of the seven polysome fractions.

  B. Hierarchical clustering of TSS isoform abundance across all fractions and replicates. Each row represents one TSS isoform, and each column represents a different fraction. Z‐scores, shown in different colors, represent the normalized isoform abundance across fractions.

Figure EV2. Polysome profiling measured mRNA translational efficiency
  A. For each TSS isoform, the number of ribosomes per mRNA was plotted against its corresponding ORF length.

  B. For each TSS isoform, the abundance ratio between the monosome fraction and the sum of the polysome fractions was plotted against its corresponding ORF length. Short ORFs (≤ 450 nt) were more enriched in the monosome fraction.

  C. TE values for each gene calculated based on published ribosome footprinting data were plotted against the TE values calculated based on polysome profiling data in this study.

  D. TE values for each gene calculated based on published proteomics/genomics data were plotted against the TE values calculated based on polysome profiling data in this study.

  E. Log2‐transformed TE fold change values for each pair of alternative TSS isoforms calculated based on data from all seven fractions were compared to those calculated with one of the seven fractions left out.

Figure EV3. GO enrichment analyses and the relationship between transcription and translation
  A. GO enrichment for single/multi‐TSS genes over all expressed genes.

  B. Boxplots showing the distribution of TE divergence between alternative TSS isoforms grouped by their abundance differences. Box edges represent quantiles, whiskers represent extreme data points.

  C. Boxplots showing the distribution of TE at the gene level grouped by mRNA abundance. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.

Figure 2. Alternative TSSs lead to significantly differential TE in 745 out of 4,153 multi‐TSS genes
  A. Scatter plot showing the bootstrap means (x‐axis) and standard deviations (y‐axis) for log2‐transformed TE difference between 13,118 TSS isoform pairs in the 4,153 multi‐TSS genes. Dashed purple lines indicate the Benjamini–Hochberg adjusted P‐value of 0.01, and dashed orange lines indicate the 1.5‐fold divergence. Genes with significant TE divergence (Benjamini–Hochberg adjusted P‐value < 0.01, TE divergence > 1.5‐fold) are depicted in blue. See also Table EV2.

  B. Independent validation of TSS isoforms and their associated translational efficiency in genes Ndufb11, Ube4b, Nedd8, and Ssu72. Left: Under each gene structure, cumulative reads are shown for the alternative TSSs in the “free” fraction and poly9+ fraction. Green arrows above the gene structure indicate the locations of the reverse PCR primer. Red and blue bars represent sequencing reads mapped within distal and proximal TSSs, respectively; gray bars represent reads mapped outside of the identified TSSs. Right: Agarose gel electrophoresis of amplified products of mRNA 5ʹ ends obtained from the non‐ribosomal fraction and the polysomal fraction. Positions of the distal TSS isoform and the proximal TSS isoforms are indicated with red and blue arrows, respectively. In the case of gene Ndufb11, the band below the distal TSS (indicated by a yellow arrow) in the gel image was caused by an alternative splicing event, which removed an 88‐nt region for a minor fraction of transcripts initiating at the distal TSS. L, HyperLadder I; N, non‐ribosomal fraction; P, polysomal fraction. The description of these genes can be found in Table EV3.

  C. Alternative 5ʹUTR sequences are able to drive the observed isoform‐specific TE divergence. An in vivo reporter system was used to compare the TE of a Renilla luminescent reporter gene led by the 5ʹUTR sequences derived from eight pairs of alternative TSS isoforms identified in eight genes. TE was calculated as luciferase activity normalized to mRNA abundance. Seven out of eight reporter pairs showed significant differential TE biased toward the same TSS isoforms as observed in our global analysis (n = 3; mean ± SEM; *P < 0.05, **P < 0.01; Student's t‐test). The description of these genes can be found in Table EV3.
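
The significance filter named in the scatter plot legend of Figure 2 (Benjamini–Hochberg adjusted P‐value < 0.01 and TE divergence > 1.5‐fold) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the raw P‐values are taken as given because the legend does not say how they were derived from the bootstrap.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_pairs(log2_te_diff, pvalues, alpha=0.01, min_fold=1.5):
    """Flag isoform pairs passing both thresholds quoted in the legend.

    log2_te_diff: log2-transformed TE difference per isoform pair.
    pvalues: raw P-value per pair (assumed input, see lead-in).
    """
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(min_fold)  # 1.5-fold on the log2 scale
    return [a < alpha and abs(d) > cutoff
            for d, a in zip(log2_te_diff, adjusted)]
```

For example, `significant_pairs([1.0, 0.2, -1.0], [0.001, 0.001, 0.5])` flags only the first pair: the second fails the fold cutoff and the third fails the adjusted P‐value cutoff.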

Figure EV4. Quantitative validation of TE divergence between TSS isoforms
The ratio of relative isoform abundance between fractions was calculated according to the formula $\frac{T_{\mathrm{dist,poly}}/T_{\mathrm{prox,poly}}}{T_{\mathrm{dist,nonribo}}/T_{\mathrm{prox,nonribo}}}$, where $T_{\mathrm{dist,poly}}$ and $T_{\mathrm{prox,poly}}$ represent the isoform abundance of distal and proximal TSSs in the polysomal fraction, respectively, and $T_{\mathrm{dist,nonribo}}$ and $T_{\mathrm{prox,nonribo}}$ represent the corresponding isoform abundances in the non‐ribosomal fraction. The ratio determined from the agarose gel image (y‐axis) was plotted against that estimated from 5ʹ end sequencing (x‐axis).
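
The ratio defined in this legend is a ratio of ratios: the distal‐to‐proximal abundance in the polysomal fraction divided by the same quantity in the non‐ribosomal fraction. A small sketch follows; the function and argument names are illustrative and not taken from the paper.

```python
def relative_isoform_ratio(t_dist_poly, t_prox_poly,
                           t_dist_nonribo, t_prox_nonribo):
    """(T_dist,poly / T_prox,poly) / (T_dist,nonribo / T_prox,nonribo)."""
    values = (t_dist_poly, t_prox_poly, t_dist_nonribo, t_prox_nonribo)
    if any(v <= 0 for v in values):
        # The ratio is undefined or uninformative when an abundance is zero.
        raise ValueError("all isoform abundances must be positive")
    return (t_dist_poly / t_prox_poly) / (t_dist_nonribo / t_prox_nonribo)

# A value above 1 means the distal isoform is relatively enriched in polysomes.
ratio = relative_isoform_ratio(30.0, 10.0, 20.0, 20.0)
```

With the made‐up abundances above, the distal isoform is three times as abundant as the proximal one in polysomes and equally abundant in the non‐ribosomal fraction, so the ratio is 3.0.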
Figure 3. Isoforms with longer 5ʹUTR tend to have lower TE
  A. Barplots showing the fraction of alternative TSS isoform pairs with and without significant differential TE. Isoform pairs were grouped by their 5ʹUTR length difference. The larger the length difference between the two isoforms, the higher the fraction associated with significant TE divergence.

  B. Scatter plot comparing the number of ribosomes per mRNA between shorter 5ʹUTR isoforms (x‐axis) and longer 5ʹUTR isoforms (y‐axis) from the same genes. Purple and green dots represent isoform pairs with significant differential TE biased toward longer and shorter isoforms, respectively.

Figure 4. Upstream translation started at AUG negatively affects the main ORF translation
  A. Left: Boxplots comparing the log2 TE fold changes between two groups of alternative isoform pairs, one group with at least one uORF present in the isoform‐divergent 5ʹUTR and the other without. Right: The group with uORF was further separated into three subgroups according to the number of uORFs present in the divergent 5ʹUTR.

  B. Same as (A), left, but the sequence feature of interest is the out‐of‐frame uAUGs.

  C. Same as (A), left, but the sequence feature of interest is the in‐frame uAUGs.

  D. Same as (A), left, but the sequence feature of interest is the translated uORFs (i.e. supported by ribosome footprinting) with canonical AUG start codon.

  E. Same as (A), left, but the sequence feature of interest is the translated out‐of‐frame uAUGs (i.e. supported by ribosome footprinting).

  F. Same as (A), left, but the sequence feature of interest is the translated uORFs (i.e. supported by ribosome footprinting) with non‐canonical start codons.

  G. Same as (A), left, but the sequence feature of interest is the translated out‐of‐frame upstream non‐canonical start codons (i.e. supported by ribosome footprinting).

Data information: **P < 0.01, ***P < 0.001; Mann–Whitney U‐test. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.
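
The upstream AUG categories compared in Figure 4 (uORFs, out‐of‐frame uAUGs, in‐frame uAUGs) can be told apart from a 5ʹUTR sequence alone. The sketch below uses a commonly used convention, which is an assumption here because the legend does not spell out the definitions: a uAUG followed by a stop codon in its own frame within the 5ʹUTR starts a uORF; otherwise it is in‐frame or out‐of‐frame relative to the main start codon, which is assumed to begin immediately after the 5ʹUTR. The function name is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def classify_upstream_augs(utr5):
    """Classify every AUG in a 5'UTR as uORF, in-frame uAUG, or out-of-frame uAUG.

    Assumes the main ORF start codon begins right after the last UTR base.
    Returns a list of (0-based position, category) tuples.
    """
    rna = utr5.upper().replace("T", "U")  # accept DNA or RNA alphabet
    results = []
    start = rna.find("AUG")
    while start != -1:
        # Scan codons in the uAUG's frame that lie fully inside the 5'UTR.
        has_stop = any(rna[j:j + 3] in STOP_CODONS
                       for j in range(start + 3, len(rna) - 2, 3))
        if has_stop:
            kind = "uORF"
        elif (len(rna) - start) % 3 == 0:
            kind = "in-frame uAUG"
        else:
            kind = "out-of-frame uAUG"
        results.append((start, kind))
        start = rna.find("AUG", start + 1)  # allow overlapping matches
    return results
```

Under these assumptions, `classify_upstream_augs("GGAUGCCCUAAGG")` reports a uORF at position 2, while `classify_upstream_augs("CCAUGGC")` reports an out‐of‐frame uAUG at position 2. Whether a candidate is actually translated, as in the ribosome footprinting panels, cannot be decided from sequence alone.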
Figure EV5. Upstream translation at non‐AUG start codon had no significant impact on main ORF translation
  A. Similar to Fig 4F, but with non‐canonical uORFs split into uORFs led by CUGs and uORFs led by GUGs/UUGs.

  B. Similar to Fig 4G, but with out‐of‐frame upstream non‐canonical start codons split into out‐of‐frame upstream CUGs and out‐of‐frame upstream GUGs/UUGs.

Data information: Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.
Figure 5. Roles of stable RNA structures, 5ʹ TOP sequences, and sequence motifs within 5ʹUTR for translational regulation
  A. Boxplots comparing the log2 TE fold changes between three groups of alternative isoform pairs, the first group with 5ʹ cap‐adjacent (50 nt to 5ʹ ends) stable RNA secondary structures (MFE < −30 kcal/mol) present only in long 5ʹUTR isoforms, the second group with 5ʹ cap‐adjacent stable RNA structure present/absent in both isoforms, and the last group with 5ʹ cap‐adjacent stable RNA structure present only in short 5ʹUTR isoforms.

  B. Boxplots comparing the log2 TE fold changes between two groups of alternative isoform pairs, one group with stable RNA secondary structures (MFE < −35 kcal/mol in any 50‐nt RNA fragments) present in the downstream divergent 5ʹUTR and the other without.

  C. Boxplots comparing the log2 TE fold changes between TOP genes and non‐TOP genes (controls). For TOP genes, the TE fold changes were the ratios between the isoforms with 5ʹ TOP sequences present and isoforms without, and for non‐TOP genes, isoforms were randomly assigned as numerators and denominators.

  D. Left: Boxplots comparing the log2 TE fold changes between two groups of alternative isoform pairs, one group with the motif AAUCCC present in divergent 5ʹUTRs and the other without.

    Right: Luciferase assay comparing the relative TE between reporter genes with five copies of motif AAUCCC, reverse complement of motif AAUCCC, and randomly shuffled sequences in their 5ʹUTRs (n = 3; mean ± SEM; n.s. P > 0.05).

  E. Similar to (D), but the motif is CAAGAU (n = 3; mean ± SEM; *P < 0.05; Student's t‐test).

Data information: In boxplots, *P < 0.05, **P < 0.01, ***P < 0.001; Mann–Whitney U‐test. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.
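
The motif comparisons in Figure 5 rest on two simple operations: testing whether a hexamer occurs in a divergent 5ʹUTR, and building the reverse‐complement and shuffled controls used in the reporter assay. The sketch below illustrates them. All function names are hypothetical, and the direct concatenation of five copies and the seeded shuffle are assumptions, since the legend does not describe how the reporter inserts were constructed.

```python
import random

COMPLEMENT = str.maketrans("ACGU", "UGCA")

def to_rna(seq):
    """Normalize a DNA or RNA string to upper-case RNA."""
    return seq.upper().replace("T", "U")

def reverse_complement(seq):
    """Reverse complement in the RNA alphabet."""
    return to_rna(seq).translate(COMPLEMENT)[::-1]

def has_motif(divergent_utr, motif):
    """True if the motif occurs anywhere in the divergent 5'UTR region."""
    return to_rna(motif) in to_rna(divergent_utr)

def shuffled(motif, seed=0):
    """Shuffle the motif's bases; may by chance reproduce the motif itself."""
    bases = list(to_rna(motif))
    random.Random(seed).shuffle(bases)
    return "".join(bases)

def reporter_inserts(motif, copies=5, seed=0):
    """Assumed layout: `copies` back-to-back repeats of each variant."""
    return {
        "motif": to_rna(motif) * copies,
        "reverse_complement": reverse_complement(motif) * copies,
        "shuffled": shuffled(motif, seed) * copies,
    }
```

For the first motif in the figure, `reverse_complement("AAUCCC")` gives `GGGAUU`. A shuffled control keeps base composition but not order, so it is worth checking that a given shuffle differs from the motif before using it.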
Figure EV6. Cap‐adjacent stable RNA structures inhibit translation
  A, B. Two examples showing that cap‐adjacent stable RNA structures repressed translation. The description of the two genes can be found in Table EV3.

  C, D. Same as Fig 5A and B, but based on EFE to define stable RNA structures. ***P < 0.001; Mann–Whitney U‐test. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.

Figure 6. Quantitative model explaining the TE difference between alternative TSS isoforms
  A. Barplots showing the individual and cumulative contribution of sequence features in explaining the TE difference between alternative TSS isoforms. Individual: variance of TE divergence explained by the model with only the sequence feature; Cumulative: variance explained by the model combining the sequence feature and those above; Delta cumulative: additional variance explained by adding the sequence feature to the model. *The value was the contribution of all significant hexamer motifs.

  B. The combinatory nonlinear regression model based on all sequence features investigated in this study explained 57% of the variance of TE difference between alternative TSS isoforms.

  C. Histogram showing the distribution of model‐explained variance across 100 rounds of the cross‐validation procedure.

Figure EV7. Performance of the combinatory nonlinear regression model
Same as Fig 6B, but in addition, we marked the six genes that were tested by luciferase reporter assay (Fig 2C) and that contain unambiguously determined 5ʹUTR sequences (see Materials and Methods). The TE divergence values estimated based on 5ʹ end sequencing data are shown in cyan, and those based on reporter assay are shown in yellow.
Figure EV8. Sequence features associated with translational regulation conferred significant impact in genes with single or multiple 3ʹ ends
All the multi‐TSS genes were separated into two groups based on whether one or more than one 3ʹ end was identified in the study from Spies et al (2013). Similar to boxplots in Figs 4 and 5, but all comparisons were performed separately for the two groups (columns): one group contained the genes with only one 3ʹ end and the other contained the genes with more than one 3ʹ end. Sequence features including uORF, out‐of‐frame uAUG, 5ʹ cap RNA structure, and 5ʹ TOP sequence (rows) were analyzed. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.
Figure EV9. Examples of alternative TSSs for protein N‐terminal changes
See also Table EV5.
  A. Downstream TSSs could lead to N‐terminal truncated proteins. Two examples are shown here. The description of the two genes can be found in Table EV3.

  B. Alternative TSSs could also lead to N‐terminal extended proteins. One example is shown here. The description of this gene can be found in Table EV3.
