Genome Biol. 2010;11(6):R65.
doi: 10.1186/gb-2010-11-6-r65. Epub 2010 Jun 23.

Detection and analysis of alternative splicing in Yarrowia lipolytica reveal structural constraints facilitating nonsense-mediated decay of intron-retaining transcripts

Meryem Mekouar et al. Genome Biol. 2010.

Abstract

Background: Hemiascomycetous yeasts have intron-poor genomes with very few cases of alternative splicing. Most of the reported examples result from intron retention in Saccharomyces cerevisiae and some have been shown to be functionally significant. Here we used transcriptome-wide approaches to evaluate the mechanisms underlying the generation of alternative transcripts in Yarrowia lipolytica, a yeast highly divergent from S. cerevisiae.

Results: Experimental investigation of Y. lipolytica gene models identified several cases of alternative splicing, mostly generated by intron retention, principally affecting the first intron of the gene. The retention of introns almost invariably creates a premature termination codon, as a direct consequence of the structure of intron boundaries. An analysis of Y. lipolytica introns revealed that introns of multiples of three nucleotides in length, particularly those without stop codons, were underrepresented. In other organisms, premature termination codon-containing transcripts are targeted for degradation by the nonsense-mediated mRNA decay (NMD) machinery. In Y. lipolytica, homologs of S. cerevisiae UPF1 and UPF2 genes were identified, but not UPF3. The inactivation of Y. lipolytica UPF1 and UPF2 resulted in the accumulation of unspliced transcripts of a test set of genes.

Conclusions: Y. lipolytica is the hemiascomycete with the most intron-rich genome sequenced to date, and it has several unusual genes with large introns or alternative transcription start sites, or introns in the 5' UTR. Our results suggest Y. lipolytica intron structure is subject to significant constraints, leading to the under-representation of stop-free introns. Consequently, intron-containing transcripts are degraded by a functional NMD pathway.

Figures

Figure 1
Characteristics of Y. lipolytica introns. (a) Size distribution of the 1,083 introns from strain E150 located within the coding regions of genes. Introns are separated into three size classes: multiples of 3 nucleotides (blue line), multiples of 3 plus 1 nucleotides (orange line), and multiples of 3 plus 2 nucleotides (green line). For each class, the number of introns is reported as a function of size, with a window of 20 nucleotides from 41 nucleotides to more than 1,000 nucleotides. (b) Position of introns within the CDS. Introns are separated according to their order in the gene model, from start to stop: first introns of genes (red boxes), second introns of genes (orange boxes) and other introns (green boxes). Data for all introns considered together are shown in black. The proportion of introns in each group is plotted as a function of their relative position within the CDS, with a window of 10% of the CDS length.
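The binning described for panel (a) can be sketched as follows. This is a minimal illustration only: it assumes the 20-nucleotide windows run 41-60, 61-80 and so on up to 981-1000, with longer introns pooled, which the legend does not state explicitly. The function names and the example lengths are hypothetical and are not taken from the paper.

```python
from collections import Counter

LENGTH_CLASSES = ("3n", "3n+1", "3n+2")

def length_class(length):
    """Return the length class of an intron (its remainder modulo 3)."""
    return LENGTH_CLASSES[length % 3]

def size_window(length, start=41, width=20, cap=1000):
    """Return the label of the window containing `length` (assumed layout)."""
    if length < start:
        return f"<{start}"
    if length > cap:
        return f">{cap}"
    low = start + ((length - start) // width) * width
    return f"{low}-{low + width - 1}"

def tabulate(lengths):
    """Count introns per (length class, size window) pair."""
    return Counter((length_class(n), size_window(n)) for n in lengths)

# Hypothetical intron lengths, for illustration only.
example_lengths = [45, 60, 61, 299, 1200]
counts = tabulate(example_lengths)
for key in sorted(counts):
    print(key, counts[key])
```

Each of the three curves in panel (a) would then correspond to one length class, plotted across the size windows.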
Figure 2
Distribution of introns as a function of their length and insertion frame. (a) Introns are represented according to the three possible frames of the CDS. Phase 0 indicates that the intron is located between two codons, phase 1 indicates that it is located after the first nucleotide of a codon and phase 2 indicates that it is located after the second nucleotide of a codon. 'All introns' corresponds to the 1,083 introns, 'first introns' to the first intron of the 951 intron-containing genes and 'other introns' to the 131 second, third, fourth and fifth introns of genes. Differences between insertion phases were statistically significant for all introns (χ² = 64.68, P = 8.98e-15) or for the first introns (χ² = 60.68, P = 6.63e-14) but not for introns other than the first intron (χ² = 5.50, P = 0.063), probably due to their limited number. (b) The proportions of each of the four bases are represented for each base of the codons of the 6,449 protein-coding genes. Differences in nucleotide distribution were statistically significant for each position within the codon (χ² test, P << 1e-100). Stop codons were not considered. (c) Introns shown according to length categories, corresponding to a multiple of 3 (3n) or a multiple of 3 plus 1 nucleotides (3n + 1) or plus 2 nucleotides (3n + 2). There were 204 introns ≤60 nucleotides in length. The underrepresentation of 3n introns was statistically significant for all introns (χ² = 7.35, P = 0.025), first introns (χ² = 10.90, P = 0.004) and for introns no longer than 60 nucleotides (χ² = 6.70, P = 0.034). (d) Stop-free introns are represented according to their insertion frame and length category.
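The chi-square statistics and P values quoted for panels (a) and (c) appear consistent with a test on three categories, that is, two degrees of freedom. For two degrees of freedom the upper-tail probability has the closed form exp(-x/2), so the reported values can be checked with the standard library alone. The sketch below rests on that assumption; the legend does not state the degrees of freedom or the expected counts, and the equal-expectation default and the example counts are hypothetical.

```python
import math

def chi_square_statistic(observed, expected=None):
    """Pearson chi-square statistic; expected defaults to equal counts."""
    if expected is None:
        mean = sum(observed) / len(observed)
        expected = [mean] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_two_df(chi2):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)

# Statistics quoted in the legend, paired with the reported P values.
reported = {
    64.68: 8.98e-15,
    60.68: 6.63e-14,
    5.50: 0.063,
    7.35: 0.025,
    10.90: 0.004,
    6.70: 0.034,
}
for chi2, p_reported in reported.items():
    print(f"chi2={chi2:6.2f}  computed P={p_value_two_df(chi2):.3g}  reported P={p_reported:.3g}")

# Hypothetical counts for three classes, for illustration only.
hypothetical_counts = [30, 40, 50]
statistic = chi_square_statistic(hypothetical_counts)
print(statistic, p_value_two_df(statistic))
```

Small differences between the computed and reported P values are expected, because the statistics in the legend are rounded to two decimal places.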
Figure 3
Presence of premature termination codons (PTCs) in spliceosomal introns, as a function of intron size (3n, 3n + 1, 3n + 2) and insertion frame (frame 0, 1 and 2) within the coding sequence. (a) A PTC is generated for all retained introns inserted in frame 2 and containing GTGAGT or GTAAGT as the 5' splice site (5'ss) sequence, regardless of their length; 209 introns are affected, that is, 19.3% of all intron-containing genes. (b) PTCs (TAA) are also detected in the branch point (BP) of 3n + 2 introns in frame 0, 3n + 1 introns in frame 1 or 3n introns in frame 2 if the S2 distance is indeed 1 bp. (c) The main 3' splice site (3'ss) is CAG, but, in about 10.5% of the introns, TAG is also used. This sequence generates a PTC for 3n introns inserted in frame 0, 3n + 2 introns in frame 1 and 3n + 1 introns in frame 2. Overall, conserved intron motifs are present in about 50% of the PTC-containing introns.
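The frame arithmetic behind panels (a) and (c) can be sketched in a few lines. In frame 2 the codon interrupted by the intron is completed by the first intron base (G), so the next codon read from GTGAGT or GTAAGT is TGA or TAA. For a terminal TAG to be read as a codon, the insertion frame plus the intron length must be a multiple of three. The upstream sequence and the function names below are hypothetical and serve only as an illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_in_frame_stop(sequence):
    """Return the 0-based start of the first in-frame stop codon, or None."""
    for i in range(0, len(sequence) - 2, 3):
        if sequence[i:i + 3] in STOP_CODONS:
            return i
    return None

def intron_end_in_frame(phase, intron_length):
    """True if the last three intron nucleotides are read as one codon."""
    return (phase + intron_length) % 3 == 0

# Hypothetical upstream coding sequence ending after the second base of a
# codon (frame 2), followed by the 5' end of a retained intron.
upstream = "ATGGC"
for five_prime_ss in ("GTGAGT", "GTAAGT"):
    transcript = upstream + five_prime_ss
    stop_at = first_in_frame_stop(transcript)
    print(five_prime_ss, stop_at, transcript[stop_at:stop_at + 3])

# (frame, intron length modulo 3) pairs in which a TAG 3'ss is read in frame.
in_frame_3ss = [
    (phase, remainder)
    for phase in range(3)
    for remainder in range(3)
    if intron_end_in_frame(phase, remainder)
]
print(in_frame_3ss)
```

The three pairs printed at the end correspond to the combinations listed in panel (c): 3n introns in frame 0, 3n + 2 introns in frame 1 and 3n + 1 introns in frame 2.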
Figure 4
Schematic representation of alternative transcripts from multi-intronic genes. Gene models include exons, represented by gray rectangles and introns, symbolized by thin black articulated lines. Vertical bars on each of the three phases (0, +1 and +2) represent an in-frame stop codon. The resulting mRNA variants are depicted as a concatenation of exons and the thick black vertical line represents the first in-frame codon of the transcript. The size of the putative proteins derived from each splicing variant is indicated on the right. All three genes generate at least three different splicing variants. (a) YALI0C23496g mRNAs are subject to intron retention (intron 1) or exon skipping (exon 2). The retention of intron 1 generates a PTC and a putative peptide of 11 amino acids. Exon 2 skipping generates a frameshift in exon 3 and in exon 4, which is slightly shortened (exon 4'), and generates a putative protein of 65 amino acids. (b) YALI0F26873g splicing variants display retained intron 1, alternative 3'ss (intron 2) usage or the skipping of exon 3. Both variants with a retained intron 1 generate a PTC in exon 2 and a putative truncated protein of 259 amino acids. (c) In YALI0F32043g mRNAs, the retention of intron 5 and the use of an alternative 3'ss do not generate a PTC or a frameshift, because intron 5 is 60 nucleotides long, a multiple of three, and the difference between E4 and E4' is 15 nucleotides, also a multiple of three. Both variants generate a putative protein of about the same size as that generated by the fully spliced transcript. Because exon 6 is large, it is shown truncated with horizontal dashed lines.
Figure 5
Schematic diagram of alternative variants of YALI0D18403g. The two different transcription start sites (TSS1 and TSS2) are indicated by arrows. (a) TSS2 is located 179 bases upstream of the methionine initiation codon of YALI0D18403g1 (position 2309045 on chromosome D) downstream of YALI0D18436g and allows the transcription of a single exon. Translation of this mRNA generates a putative protein of 1,322 amino acids. (b) TSS1 is located about 3 kb upstream of TSS2 and initiates a transcript with a 3,478-nucleotide intron. Surprisingly, this intron overlaps YALI0D18436g, a 1,062-base CDS whose translation generates a putative 353 amino acid protein of unknown function. Translation of the YALI0D18403g2 mRNAs generates a putative protein of 1,424 amino acids.
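The relation between the CDS length and the protein size quoted for YALI0D18436g can be checked with simple arithmetic. The sketch below assumes that the 1,062 bases include the stop codon, which the legend does not state; the function name is hypothetical.

```python
def protein_length(cds_length_nt, includes_stop_codon=True):
    """Number of amino acids encoded by a CDS of the given length."""
    if cds_length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = cds_length_nt // 3
    # The stop codon, if counted in the CDS, does not encode an amino acid.
    return codons - 1 if includes_stop_codon else codons

print(protein_length(1062))
```

Under that assumption, 1,062 bases correspond to 354 codons, one of which is the stop codon, which agrees with the 353 amino acids given in the legend.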
Figure 6
Alternative splicing in YALI0B15598g and conservation of gene models in Dikarya species. (a) Gene models for YALI0B15598g. Exons are represented by gray or black (skipped exon) rectangles and introns by thin black lines. The size of the putative protein is 502 amino acids when intron 1 and intron 2 are efficiently spliced, or 489 amino acids when exon 2 is skipped. (b) Amino acid alignment of the amino-terminal domain of fungal and yeast proteins, homologs of YALI0B15598g. The size of this domain is given in amino acids, on the right, for each protein (from 20 to 41). The black rectangle groups together hemiascomycetous yeasts or ascomycetous filamentous fungi. Archiascomycetes are represented by S. pombe and basidiomycetes by Ustilago maydis. The numbers of spliced introns (column on the right) are colored identically when intron positions are conserved within genes: blue for most hemiascomycetous yeasts, red for Y. lipolytica, green for all ascomycetous filamentous fungi, yellow for S. pombe and black for U. maydis. (c) Intron localization. Triangles indicate the position of the introns for the different groups of genes (same colors as in (b)). Only intron 4 of Y. lipolytica is conserved in all genes.
Figure 7
Gene expression in the NMD- context. (a) Variations in the level of expression of YALI0C23496g splicing variants as a function of NMD context. RT-PCR products from spliced (S) and unspliced transcripts (intron 1 retained, R) from wild-type strains (WT) and NMD mutants (NMD-). Wild-type strains are E150 (lane 1) and PO1d (lane 2). NMD- strains are two independent knockouts of UPF1 (lane 3, upf1::LEU2 clone 7; lane 4, upf1::LEU2 clone C) and one UPF2 knockout (lane 5: upf2::LEU2 clone 7). The intensity of the unspliced transcripts is much stronger in the mutant strains. (b) Expression of the different transcripts of the Y. lipolytica YRA1 gene. Northern blot of total RNA of wild-type (WT) strain PO1d (lane 1) and NMD- mutant strains upf1::LEU2 clone 7 (lane 2), upf2::LEU2 clone 7 (lane 3), upf1::URA3 upf2::LEU2 (lane 4), xrn1::LEU2 (lane 5). The exon probe binding to exons 1 and 3 reveals the spliced transcript (S) in all strains and an additional splicing variant in NMD- mutants only. This variant corresponds to the retention of intron 1 (R). Hybridization with intron 1 confirmed that this intron is retained only in NMD- mutants, whereas it is efficiently spliced out in PO1d and xrn1- mutants.
