BMC Bioinformatics. 2011 May 31:12:219.
doi: 10.1186/1471-2105-12-219.

RAG: an update to the RNA-As-Graphs resource

Joseph A Izzo et al. BMC Bioinformatics.

Abstract

Background: In 2004, we presented a web resource for stimulating the search for novel RNAs, RNA-As-Graphs (RAG), which classified, catalogued, and predicted RNA secondary structure motifs using clustering and build-up approaches. With the increased availability of secondary structures in recent years, we update the RAG resource and provide various improvements for analyzing RNA structures.

Description: Our RAG update includes a new supervised clustering algorithm that can suggest RNA motifs that may be "RNA-like". We use this utility to classify RNA motifs into three classes: existing, RNA-like, and non-RNA-like. With existing RNAs serving as the training data, the supervised clustering algorithm produces 126 tree graphs and 16,658 dual graphs as candidate RNA-like topologies. A comparison of this clustering approach to an earlier method shows considerable improvements. Additional RAG features include greatly expanded search capabilities, an interface that makes better use of the underlying relational database, and improvements to several of the utilities, such as directed/labeled graphs and a subgraph search program.

Conclusions: The RAG updates presented here augment the database's intended function - stimulating the search for novel RNA functionality - by classifying available motifs, suggesting new motifs for design, and allowing more targeted searches for specific topologies. The updated RAG web resource offers users a graph-based tool for exploring available RNA motifs and suggesting new RNAs for design.
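
As a rough illustration of the supervised step described above, the sketch below labels an unclassified topology by majority vote among its nearest labeled neighbours (k-NN, the method named in the Figure 6 caption). Everything concrete here is an assumption made for illustration: the two-dimensional feature vectors, the toy training points, the choice k = 3, and the helper name `knn_predict` are hypothetical and are not taken from the RAG implementation.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of ((x, y), label) pairs; query: an (x, y) tuple.
    The features are placeholders for whatever graph descriptors are used.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set (not RAG data): two well-separated groups.
TOY_TRAIN = [
    ((0.10, 0.20), "RNA-like"),
    ((0.20, 0.10), "RNA-like"),
    ((0.15, 0.30), "RNA-like"),
    ((0.90, 0.80), "non-RNA-like"),
    ((0.80, 0.90), "non-RNA-like"),
    ((0.85, 0.70), "non-RNA-like"),
]
```

Calling `knn_predict(TOY_TRAIN, (0.2, 0.2))` assigns the query to the group whose members dominate its three nearest neighbours. An odd k with two classes avoids tied votes.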

Figures

Figure 1
A depiction of how to represent RNA secondary structures as both dual and tree graphs. All non-pseudoknot structures (the first and second rows) can be translated into both dual and tree graphs, while a pseudoknot structure (the third row) can only be depicted by a dual graph. Each RNA in the first column exists in nature as a whole (Hammerhead ribozyme) or as part of a domain (mRNA - Secis element and rRNA); their RNA Strand IDs are in parentheses.
Figure 2
The current number of existing (a) tree graphs and (b) dual graphs for each vertex number (2 to 10 for tree graphs and 2 to 9 for dual graphs) in the updated RAG database. Each bar is divided according to the topological classifications made in 2004 (existing, RNA-like, and non-RNA-like), shown in red, blue, and green, respectively. Since the launch of RAG in 2004, more than twice as many topologies have been identified, most of which were classified as RNA-like in 2004 (see the second column of Tables 1 and 2 for the distribution of existing motifs at each vertex number for tree graphs and dual graphs, respectively).
Figure 3
The current number of newly confirmed tree graphs (a) and dual graphs (b) according to method of discovery - experimental (solved) natural structures, comparatively analyzed natural structures, all natural structures, and all RNAs (including both natural and synthetic structures) in the updated RAG database. Each number is divided according to the topological classifications made in 2004 (RNA-like and non-RNA-like), shown in blue and green, respectively.
Figure 4
Examples of newly confirmed RNA tree graphs from RNA-like (a) or non-RNA-like (b) graphs classified in the 2004 RAG database. The vertex number/ID (first column) and the second smallest eigenvalue (second column) are shown for each RNA tree graph (third column). RNA secondary structures and their functions are shown in the fourth and fifth columns.
Figure 5
Examples of newly confirmed RNA dual graphs from RNA-like (a) or non-RNA-like (b) graphs classified in the 2004 RAG database. The vertex number/ID (first column) and the second smallest eigenvalue (second column) are shown for each RNA dual graph (third column). RNA secondary structures and their functions are shown in the fourth and fifth columns. In (a), the five identified RNAs that correspond to our candidate topologies with 3 and 4 vertices in 2004 using PAM [32] are shown (C1, C2, C3, C4 and C7).
Figure 6
Clustering plots of PAM and k-NN clustering for 38 RNA dual graphs with 3 and 4 vertices (a versus c) and for 146 RNA dual graphs with 3 to 5 vertices (b versus d). Existing, RNA-like, and non-RNA-like topologies are represented as red, blue, and green, respectively. Each ellipse encloses at least 85% of the RNA-like or non-RNA-like group members. In (a) and (b), the centers of two groups (M1 and M2) are marked with an X and the topologies of RNA-like groups' centers are shown. M1 and M2 emerge as pseudoknot RNAs with universal features, corresponding to structures of tmRNA (PKB234) and a candidate topology similar to the Box H/ACA snoRNA (RF00233). In (a), C1, C2, C3, C4, and C7 are existing topologies which were classified as RNA-like topologies in 2004 using PAM [32] (see Figure 5) and B is a confirmed bridge structure corresponding to U5 spliceosomal RNA (RF00020, see the second row in Figure 2). In (b) and (d), two candidate topologies (C4-1 and C4-2) are shown which are similar to the newly-confirmed existing topology C4 (Tombusvirus 3' UTR region IV). In (c), C5, C6 and C10 remain as RNA-like while C8 and C9 have been changed into non-RNA-like topologies. See Figures 5 and 9 for C1-C10. In (d), eight reclassified candidate pseudoknot topologies (P1-P8, currently RNA-like topologies which were classified non-RNA-like in 2004) are shown; P1, P2 and P3 are similar to the newly-confirmed existing topology Viral 3' UTR (Pseudobase++: PKB169).
Figure 7
The library of topologies for tree graphs between 2 and 10 vertices, with the second smallest Laplacian eigenvalue (λ2) listed. RNA families with sequences belonging to select topologies are listed below their corresponding tree graph. Existing topologies, RNA-like, and non-RNA-like topologies based on the 2010 clustering are represented by red, blue or black (dashed) colors, respectively. For tree graphs with 2-6 vertices, the complete library is shown (all topologies are currently existing in nature). For tree graphs with 7-10 vertices, the partial library of topologies (existing, RNA-like, and non-RNA-like) is shown. See the RAG website for the complete library of topologies with 7-10 vertices.
Figure 8
The library of 71 existing topologies for dual graphs between 2 and 9 vertices: from these 71 topologies, 29, 24, and 18 were classified as existing, RNA-like and non-RNA-like, respectively, in 2004. The vertex number and ID are in parentheses above each graph.
Figure 9
The full library of topologies for dual graphs between 2 and 4 vertices. The current status of existing, RNA-like, and non-RNA-like topologies is represented by red, blue or black (dashed) colors, respectively, based on the new clustering approach. The structural class (tree, bridge or pseudoknot) is also shown for each graph. Ten candidate topologies (C1-C10) in 2004 suggested by PAM are shown in the yellow box. See the RAG website for the library of topologies with 5-9 vertices.
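
The second smallest Laplacian eigenvalue (λ2) quoted in the captions for Figures 4, 5, and 7 can be computed from a graph's edge list. The sketch below builds the Laplacian L = D - A of a simple undirected graph and extracts its spectrum with a cyclic Jacobi iteration. It is a minimal illustration under stated assumptions, not the RAG code: the function names are hypothetical, the Jacobi solver is a stand-in for any symmetric eigenvalue routine, and self-loops are rejected, so it suits tree graphs rather than general dual graphs.

```python
import math


def laplacian(n, edges):
    """Build the n x n Laplacian L = D - A from a list of (u, v) edges."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            raise ValueError("self-loops are not handled in this sketch")
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def symmetric_eigenvalues(M, tol=1e-18, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, ascending."""
    A = [row[:] for row in M]
    n = len(A)
    for _ in range(max_sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle (|theta| <= pi/4) that zeroes A[p][q].
                if A[p][p] == A[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / (A[q][q] - A[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def lambda2(n, edges):
    """Second smallest Laplacian eigenvalue of a graph with n >= 2 vertices."""
    return symmetric_eigenvalues(laplacian(n, edges))[1]
```

For example, `lambda2(4, [(0, 1), (1, 2), (2, 3)])` treats a four-vertex linear tree and `lambda2(4, [(0, 1), (0, 2), (0, 3)])` a four-vertex star. The star gives the larger value, consistent with λ2 increasing as a graph becomes more compactly connected.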

References

    1. Fera D, Kim N, Shiffeldrim N, Zorn J, Laserson U, Gan HH, Schlick T. RAG: RNA-As-Graphs web resource. BMC Bioinformatics. 2004;5:88–97. doi: 10.1186/1471-2105-5-88.
    2. Gan HH, Fera D, Zorn J, Shiffeldrim N, Tang M, Laserson U, Kim N, Schlick T. RAG: RNA-As-Graphs Database - Concepts, Analysis, and Features. Bioinformatics. 2004;20:1285–1291. doi: 10.1093/bioinformatics/bth084.
    3. Famulok M, Hartig JS, Mayer G. Functional aptamers and aptazymes in biotechnology, diagnostics, and therapy. Chemical Reviews. 2007;107:3715–3743. doi: 10.1021/cr0306743.
    4. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816. doi: 10.1038/nature05874.
    5. Pheasant M, Mattick JS. Raising the estimate of functional human sequences. Genome Res. 2007;17:1245–1253. doi: 10.1101/gr.6406307.
