Nucleic Acids Res. 2000 Apr 15;28(8):1700-6.
doi: 10.1093/nar/28.8.1700.

Test of intron predictions reveals novel splice sites, alternatively spliced mRNAs and new introns in meiotically regulated genes of yeast

C A Davis et al. Nucleic Acids Res.

Abstract

Correct identification of all introns is necessary to discern the protein-coding potential of a eukaryotic genome. The existence of most of the spliceosomal introns predicted in the genome of Saccharomyces cerevisiae remains unsupported by molecular evidence. We tested the intron predictions for 87 introns predicted to be present in non-ribosomal protein genes, more than a third of all known or suspected introns in the yeast genome. Evidence supporting 61 of these predictions was obtained, 20 predicted intron sequences were not spliced and six predictions identified an intron-containing region but failed to specify the correct splice sites, yielding a successful prediction rate of <80%. Alternative splicing has not been previously described for this organism, and we identified two genes (YKL186C/MTR2 and YML034W) which encode alternatively spliced mRNAs; YKL186C/MTR2 produces at least five different spliced mRNAs. One gene (YGR225W/SPO70) has an intron whose removal is activated during meiosis under control of the MER1 gene. We found eight new introns, suggesting that numerous introns still remain to be discovered. The results show that correct prediction of introns remains a significant barrier to understanding the structure, function and coding capacity of eukaryotic genomes, even in a supposedly simple system like yeast.
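
The counts reported in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and treating the six "correct region, wrong splice sites" cases as partial successes is an assumption made here to show that the rate stays below 80% under either reading.

```python
# Counts taken from the abstract; variable names are illustrative only.
tested = 87        # intron predictions tested
supported = 61     # predictions supported by evidence
not_spliced = 20   # predicted introns that were not spliced
wrong_sites = 6    # intron-containing region found, splice sites wrong

# The three outcome classes should account for every tested prediction.
assert supported + not_spliced + wrong_sites == tested

# Strict reading: only fully supported predictions count as successes.
strict_rate = supported / tested

# Lenient reading (an assumption): also count the six cases where the
# region was identified but the splice sites were mis-specified.
lenient_rate = (supported + wrong_sites) / tested

print(f"strict success rate:  {strict_rate:.1%}")   # 70.1%
print(f"lenient success rate: {lenient_rate:.1%}")  # 77.0%
```

Both figures fall under the <80% stated in the abstract.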

Figures

Figure 1
Identification of true splice sites alters predicted proteins. (A) An incorrectly predicted intron required for YBR090C (gray box) uses a 5′ splice site not compatible with YBR090C. In addition, 5′ PCR primer 1 produced no RT–PCR product in combination with the 3′ primer A, suggesting that transcription initiates downstream of primer 1. Thus, YBR090C is questionable and there is an intron in the mRNA leader of YBR089C-A (white box). (B) An incorrectly predicted intron required for the N-terminal segment of YDL189W (gray boxes) uses a 3′ splice site not compatible with the reading frame. Thus, YDL189W is smaller than currently annotated and has an intron in its mRNA leader. (C) An incorrectly predicted intron causes underestimation of the extent of YOL047C. An AAG 3′ splice site is used, extending the ORF in the N-terminal direction. (D) An unspecified intron prediction in YKL157W uses splice sites that fuse two ORFs, YKL158W and YKL157W. For all diagrams, protein coding regions and intron predictions that are incorrect are shown in gray, parts of the original ORF annotation that are correct are shown in white and confirmed introns and new protein coding predictions are shown in black.
Figure 2
Newly identified introns extend ORFs. (A) An intron near the 3′ end of YGR225W extends the C-terminus and draws an ORF on the opposite strand into question. (B) An intron near the 3′ end of the annotated YBR186W ORF extends the C-terminus. (C) A second intron upstream of the annotated YGR001C ORF extends the N-terminus by 54 amino acids. (D) An intron upstream of the annotated YLR211C ORF extends the N-terminus by 86 amino acids. For all diagrams, protein coding regions that are incorrect are shown in gray, parts of the original ORF annotation that are correct are shown in white and confirmed introns and new protein coding predictions are shown in black. Numbers above the ORF indicate amino acids added or (in parentheses) deleted relative to the original annotation.
Figure 3
Alternatively spliced mRNAs. (A) At least six forms of mRNA that differ by splicing arise from the YKL186C region. PCR products from genomic DNA (gDNA, lane 1) and RT–PCR products from cDNA (RNA, lane 2) are compared to marker DNA (lane M, 100 bp ladder marker, the fastest migrating band is 100 bp, next fastest is 200 bp, etc.). Positions of the two 5′ splice sites (labeled 1 and 2) and the three 3′ splice sites (labeled a–c) relative to YKL186C are shown on the unspliced RNA (un) to the right of the gel. Different spliced forms of YKL186C mRNA and the migration of the corresponding PCR products are indicated. From top: un, unspliced; 2-a, 5′ splice site 2 joined to 3′ splice site (3′ss) a; 1-a*, 5′ss 1 joined to 3′ss a; 2-b, 5′ss 2 joined to 3′ss b; 2-c, 5′ss 2 joined to 3′ss c; 1-b, 5′ss 1 joined to 3′ss b; 1-c, 5′ss 1 joined to 3′ss c. Shaded boxes indicate additional amino acids encoded at the N-terminus of the YKL186C coding sequence. Each spliced form is identified by sequence of cloned PCR products except for 1-a, as indicated by the asterisk. The numbers above exon segments refer to amino acids encoded, the numbers below the introns refer to nucleotides removed by splicing. (B) Two forms of spliced RNA from the YML034W region. A previously unannotated intron near the 3′ end of YML034W uses two different 5′ splice sites. Top, positions of YML034W and YML033W (white boxes); middle, use of the downstream 5′ss leads to fusion of most of the YML034W (white box) coding sequence to 48 amino acids plus the coding sequence of YML033W (black box); bottom, use of the upstream 5′ss leads to fusion of most of the YML034W coding sequence to a different 48 amino acids (gray box) encoded in a different reading frame.
Figure 4
A meiosis-specific intron in YGR225W/SPO70. (A) Splicing efficiency of the intron during meiosis. An SK1 yeast strain was shifted to sporulation medium and RNA was extracted at the indicated times (hours) and subjected to RT–PCR using primers that span the YGR225W/SPO70 intron. Lane 1, 100 bp ladder as for Figure 3; lanes 2–8, 0–12 h after induction of sporulation. Arrow U, PCR signal derived from unspliced RNA; arrow S, PCR signal derived from spliced RNA. (B) Splicing of the YGR225W/SPO70 intron is activated by MER1. Haploid cells were transformed with a plasmid carrying a segment of YGR225W/SPO70 spanning the intron under control of a strong constitutive promoter, and a second plasmid either containing (lane 2, +) or lacking (lane 3, –) the MER1 gene. RNA was extracted and subjected to RT–PCR. Lane 1, 100 bp ladder and arrows are as for (A). The structure of YGR225W is shown in Figure 2A.
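
The splice-site labelling in the Figure 3A caption (two 5′ splice sites, 1 and 2, and three 3′ splice sites, a to c) implies six possible pairings. The following sketch enumerates them and sets aside the one form the caption marks as not identified by sequencing (1-a). Reading the abstract's "at least five different spliced mRNAs" as the five sequence-confirmed forms is an interpretation made here, not a statement from the paper, and the variable names are hypothetical.

```python
from itertools import product

# Splice-site labels as given in the Figure 3A caption.
five_prime_sites = ["1", "2"]
three_prime_sites = ["a", "b", "c"]

# Every pairing of one 5' splice site with one 3' splice site.
spliced_forms = [f"{donor}-{acceptor}"
                 for donor, acceptor in product(five_prime_sites, three_prime_sites)]

# The caption notes that 1-a was not identified by sequencing cloned PCR products.
not_sequenced = {"1-a"}
sequence_confirmed = [form for form in spliced_forms if form not in not_sequenced]

print(spliced_forms)       # ['1-a', '1-b', '1-c', '2-a', '2-b', '2-c']
print(sequence_confirmed)  # ['1-b', '1-c', '2-a', '2-b', '2-c']
```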
