The RNA structure alignment ontology

James W Brown, Amanda Birmingham, Paul E Griffiths, Fabrice Jossinet, Rym Kachouri-Lafond, Rob Knight, B Franz Lang, Neocles Leontis, Gerhard Steger, Jesse Stombaugh, Eric Westhof

PMID: 19622678
PMCID: PMC2743057
DOI: 10.1261/rna.1601409

James W Brown et al. RNA. 2009 Sep.

RNA. 2009 Sep;15(9):1623-31.

doi: 10.1261/rna.1601409. Epub 2009 Jul 21.

Abstract

Multiple sequence alignments are powerful tools for understanding the structures, functions, and evolutionary histories of linear biological macromolecules (DNA, RNA, and proteins), and for finding homologs in sequence databases. We address several ontological issues related to RNA sequence alignments that are informed by structure. Multiple sequence alignments are usually shown as two-dimensional (2D) matrices, with rows representing individual sequences, and columns identifying nucleotides from different sequences that correspond structurally, functionally, and/or evolutionarily. However, the requirement that sequences and structures correspond nucleotide-by-nucleotide is unrealistic and hinders representation of important biological relationships. High-throughput sequencing efforts are also rapidly making 2D alignments unmanageable because of vertical and horizontal expansion as more sequences are added. Solving the shortcomings of traditional RNA sequence alignments requires explicit annotation of the meaning of each relationship within the alignment. We introduce the notion of "correspondence," which is an equivalence relation between RNA elements in sets of sequences as the basis of an RNA alignment ontology. The purpose of this ontology is twofold: first, to enable the development of new representations of RNA data and of software tools that resolve the expansion problems with current RNA sequence alignments, and second, to facilitate the integration of sequence data with secondary and three-dimensional structural information, as well as other experimental information, to create simultaneously more accurate and more exploitable RNA alignments.
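Because correspondence is described as an equivalence relation between RNA elements, one way to picture it is as a partition of elements into classes. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Element` and `Correspondence` names, the `kind` and `label` fields, and the rule that only elements of the same kind may correspond are all hypothetical choices made for illustration. A union-find structure is used because it gives reflexivity, symmetry, and transitivity without storing every pair.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One RNA element in one sequence (all field names are hypothetical)."""
    seq_id: str  # identifier of the sequence that contains the element
    kind: str    # e.g. "nucleotide", "region", "base_pair", "helix"
    label: str   # local name or position, e.g. "P3" or "17"


class Correspondence:
    """Equivalence relation over elements, stored as a union-find forest."""

    def __init__(self):
        self._parent = {}

    def _find(self, e):
        # Register unseen elements as their own class (reflexivity).
        self._parent.setdefault(e, e)
        root = e
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[e] != root:
            self._parent[e], e = root, self._parent[e]
        return root

    def relate(self, a, b):
        """Declare that a and b correspond (merges their classes)."""
        # Assumption for this sketch: only like kinds may correspond.
        if a.kind != b.kind:
            raise ValueError("elements of different kinds cannot correspond")
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self._parent[ra] = rb

    def corresponds(self, a, b):
        return self._find(a) == self._find(b)

    def classes(self):
        """Return the equivalence classes as a list of sets."""
        groups = defaultdict(set)
        for e in list(self._parent):
            groups[self._find(e)].add(e)
        return list(groups.values())


# Usage: three hypothetical sequences that each contain a helix named P3.
corr = Correspondence()
p3_a = Element("seqA", "helix", "P3")
p3_b = Element("seqB", "helix", "P3")
p3_c = Element("seqC", "helix", "P3")
corr.relate(p3_a, p3_b)
corr.relate(p3_b, p3_c)
print(corr.corresponds(p3_a, p3_c))  # follows from transitivity
```

In this sketch a helix in one sequence can correspond to a helix in another without any nucleotide-by-nucleotide pairing, so no gap characters are needed to express the relationship.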

Figures

**FIGURE 1.**
Abstract example of an RNA sequence alignment showing typical features. This simplified diagram shows many features common in sequence alignments, including representation of paired and unpaired regions, gaps, kinds of loops, etc. Some features can be conveniently represented using existing software. Others, such as noncanonical bases, cannot.

**FIGURE 2.**
Example RNA sequence alignment. This example is helix P3 and the adjacent joining regions in RNase P RNA from representative Archaea. The first seven rows are annotations. Rows *1–4* are standard numbering, relative to the *Methanothermobacter thermoautotrophicus* RNA. Row 5 contains human-readable secondary structure labels. Columns are indicated in the second and third rows. Row 6 is the machine-readable base-pairing mask. Row 7 is a human-readable guide to the pairings specified in the previous row; column “A” pairs with “A,” “B” pairs with “B,” etc. The remaining rows are individual sequences; data taken from the RNase P Database (Brown 1999).

**FIGURE 3.**
Example bacterial RNase P RNA secondary structures and correspondences. (A) The correspondence relationship between two conceptual RNA sequences; corresponding nucleotides (all that is possible in a traditional sequence alignment), corresponding regions, corresponding base pairs, and corresponding helices. (B) These types of relationships in the context of the secondary structure of RNase P RNA. Type B RNase P RNA is represented by that of *Bacillus subtilis* strain 168, and type A RNase P RNA is represented by that of *Escherichia coli* strain K12 W3110. Helices are numbered P1–P19 according to Haas et al. (1994). Taken from the RNase P Database (Brown 1999).

**FIGURE 4.**
Example RNA sequence/structure alignment. This is the same alignment as shown in Figure 2 with explicit correspondence between nucleotides shown in blue and explicit correspondence between regions shown with red boxes. Correspondence relations between base pairs and helices are not displayed here. Note that indels (gaps) are not required.

References

1. Andersen ES, Rosenblad MA, Larsen N, Westergaard JC, Burks J, Wower IK, Wower J, Gorodkin J, Samuelsson T, Zwieb C. The tmRDB and SRPDB resources. Nucleic Acids Res. 2006;34:D163–D168.
2. Bendana YR, Holmes IH. Colorstock, SScolor, Raton: RNA alignment visualization tools. Bioinformatics. 2008;24:579–580.
3. Brown JW. The Ribonuclease P Database. Nucleic Acids Res. 1999;27:314. doi: 10.1093/nar/27.1.314.
4. Burke JM, Belfort M, Cech TR, Davies RW, Schweyen RJ, Shub DA, Szostak JW, Tabak HF. Structural conventions for group I introns. Nucleic Acids Res. 1987;15:7217–7221.
5. Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M. The Sequence Ontology: A tool for the unification of genome annotations. Genome Biol. 2005;6:R44. doi: 10.1186/gb-2005-6-5-r44.
