PLoS Genet. 2009 Jan;5(1):e1000321.
doi: 10.1371/journal.pgen.1000321. Epub 2009 Jan 2.

Non-coding RNA prediction and verification in Saccharomyces cerevisiae

Laura A Kavanaugh et al. PLoS Genet. 2009 Jan.

Abstract

Non-coding RNAs (ncRNAs) play important and varied roles in cellular function. A significant amount of research has been devoted to computational prediction of these genes from genomic sequence, but the ability to do so has remained elusive due to a lack of apparent genomic features. In this work, thermodynamic stability of ncRNA structural elements, as summarized in a Z-score, is used to predict ncRNA in the yeast Saccharomyces cerevisiae. This analysis was coupled with comparative genomics to search for ncRNA genes on chromosome six of S. cerevisiae and S. bayanus. Sets of positive and negative control genes were evaluated to determine the efficacy of thermodynamic stability for discriminating ncRNA from background sequence. The effect of window sizes and step sizes on the sensitivity of ncRNA identification was also explored. Non-coding RNA gene candidates, common to both S. cerevisiae and S. bayanus, were verified using northern blot analysis, rapid amplification of cDNA ends (RACE), and publicly available cDNA library data. Four ncRNA transcripts are well supported by experimental data (RUF10, RUF11, RUF12, RUF13), while one additional putative ncRNA transcript is well supported but the data are not entirely conclusive. Six candidates appear to be structural elements in 5' or 3' untranslated regions of annotated protein-coding genes. This work shows that thermodynamic stability, coupled with comparative genomics, can be used to predict ncRNA with significant structural elements.
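
To make the Z-score scan concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the common definition of a stability Z-score, Z = (E_native - mean(E_shuffled)) / sd(E_shuffled), where E is a folding energy and the shuffled set is built from permutations of the same window. As in the figure captions, each Z-score is assigned to the center of its window. The names `toy_energy`, `z_score`, and `scan` are hypothetical, the shuffle is a simple mononucleotide shuffle (the randomization used in the paper is not stated here), and `toy_energy` is a crude hairpin-pairing count that only stands in for a real minimum free energy calculation.

```python
import random
import statistics
from typing import Callable, List, Optional, Tuple

# Hypothetical pair scores; these are NOT real thermodynamic parameters.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def toy_energy(seq: str) -> float:
    """Stand-in for a folding-energy routine (assumption, not a real MFE).

    Pairs position i with position len-1-i, as in a single perfect hairpin,
    and sums the scores of complementary pairs.
    """
    half = len(seq) // 2
    return sum(PAIR_ENERGY.get((seq[i], seq[-1 - i]), 0.0) for i in range(half))


def z_score(seq: str,
            energy: Callable[[str], float] = toy_energy,
            n_shuffles: int = 100,
            rng: Optional[random.Random] = None) -> float:
    """Z-score of the native energy against shuffled copies of the sequence."""
    rng = rng or random.Random(0)
    native = energy(seq)
    shuffled = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)  # mononucleotide shuffle (an assumption)
        shuffled.append(energy("".join(letters)))
    sd = statistics.stdev(shuffled)
    if sd == 0:
        return 0.0  # no variation among shuffles, so no signal either way
    return (native - statistics.mean(shuffled)) / sd


def scan(seq: str, window: int, step: int = 1,
         energy: Callable[[str], float] = toy_energy,
         n_shuffles: int = 100, seed: int = 0) -> List[Tuple[int, float]]:
    """Slide a window along seq; return (center position, Z-score) pairs.

    Positions are 0-based and refer to the center of each window.
    """
    rng = random.Random(seed)
    results = []
    for start in range(0, len(seq) - window + 1, step):
        center = start + window // 2
        z = z_score(seq[start:start + window], energy, n_shuffles, rng)
        results.append((center, z))
    return results


if __name__ == "__main__":
    # A perfect 20-nt hairpin embedded in a background that cannot pair.
    sequence = "A" * 30 + "G" * 10 + "C" * 10 + "A" * 30
    for center, z in scan(sequence, window=20, step=5):
        print(center, round(z, 2))
```

In this toy setting the most negative Z-scores fall on windows that cover the embedded hairpin, which mirrors the dips over the tRNA in Figures 1 and 2. For real sequences, `toy_energy` would be replaced by a call to an RNA folding program, and the window and step sizes would be varied as the abstract describes.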

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Z-score vs. position.
The tRNA (K00228.1), length 82 nt, is embedded in mRNA sequence (AF452886, 22–270 nt) at position 170–246 (represented as a black box). The Z-score for the sliding window (step size = 1) is plotted vs. position. The Z-score value is placed in the center of the window. Three different window lengths (black-60 nt; blue-82 nt; red-95 nt) are plotted. The blue plot is a scan using the exact tRNA length (82 nt) as the window size. This tRNA was detected using window lengths as short as 60 nt and as long as 95 nt.
Figure 2. Z-score vs. position.
The tRNA (AF076356.1), length 69 nt, is embedded in mRNA sequence (NM_001003966, 1–366 nt) at position 117–185 (represented as a black box). The Z-score for the sliding window (step size = 1) is plotted vs. position. The Z-score value is placed in the center of the window. Three different window lengths (black-60 nt; blue-69 nt; red-79 nt) are plotted. The blue plot is a scan using the exact tRNA length (69 nt) as the window size. This tRNA was not detected using a window length of 60 nt and was detected at only a single point using a window length of 79 nt.
Figure 3. Schematic of ncRNA candidates.
The genes annotated in SGD are represented as open boxes containing the name of the gene. Position numbers above the genes on chromosome VI are taken from SGD. Dotted lines extending from the boxes represent UTR regions and numbers above the lines indicate the measured length of the UTR. The curved vertical lines signify that the entire length of the flanking genes is not included in the figure. The ncRNAs for which complete RACE data are available are shown as black boxes, and the candidates for which there is incomplete RACE data are shown as gray or black-to-gray gradient boxes. (A) RUF20 between SEC4 and VTC2. (B) RUF21 between TUB2 and RPO41. (C) RUF22 between ROG3 and PES4. (D) RUF23 between RPL2A and YFR032C.
Figure 4. Schematic of RUF20 in S. cerevisiae, S. bayanus, and A. gossypii.
Open boxes represent the flanking genes, SEC4 and VTC2. The transcripts for which complete RACE data are available are shown as black boxes, and the candidates for which there is incomplete RACE data are shown as blank or black-to-gray gradient boxes. The coordinates for the bounds of the genes are noted in S. cerevisiae. The curved vertical lines signify that the entire length of the flanking genes is not included in the figure. (A) RUF20 in S. cerevisiae. (B) RUF20 in S. bayanus. (C) RUF20 in A. gossypii.
