RNA Biol. 2021 Apr;18(4):481-498.
doi: 10.1080/15476286.2020.1817266. Epub 2020 Sep 20.

Systematics for types and effects of RNA variations


Mauno Vihinen. RNA Biol. 2021 Apr.

Abstract

Systematics is described for the annotation of variations in RNA molecules. The conceptual framework is part of the Variation Ontology (VariO) and facilitates the depiction of types of variations, their functional and structural effects, and other consequences in any RNA molecule in any organism. There are more than 150 RNA-related VariO terms on seven levels, which can be combined to generate more complicated and detailed annotations. The terms are described together with examples, usually for variations and effects in humans and in diseases. RNA variation type has two subcategories, variation classification and origin, each with subterms. Altogether, six terms are available for function description. Several terms are available for affected RNA properties. The ontology also contains terms for structural description: affected RNA type, post-transcriptional RNA modifications, secondary and tertiary structure effects, and RNA sugar variations. Together with the DNA and protein concepts and annotations, the RNA terms allow comprehensive description of variations of genetic and non-genetic origin at all possible levels. The VariO annotations are readable by both humans and computer programs, which supports advanced data integration and mining.

Keywords: RNA; RNA variation classification; VariO; systematics; variation ontology.


Figures

Figure 1.
RNA variation types and their division into RNA variation classification and variation origin terms. The hierarchy of the terms is indicated by indentation.
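As an illustration of how an indentation-based hierarchy like the one in this figure can be read by a program, the following is a minimal sketch, not VariO tooling. The function name `parse_indented_terms`, the two-space indent width, and the three example labels (paraphrased from the caption rather than copied from the ontology) are assumptions made for this example.

```python
def parse_indented_terms(lines, indent_width=2):
    """Turn an indented outline into (depth, term, parent) rows.

    Assumes well-formed input: each line is indented with spaces and is
    at most one level deeper than the line before it.
    """
    stack = []  # stack[d] holds the most recent term seen at depth d
    rows = []
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        stripped = raw.lstrip(" ")
        depth = (len(raw) - len(stripped)) // indent_width
        term = stripped.strip()
        del stack[depth:]  # drop terms at this depth or deeper
        parent = stack[-1] if stack else None
        rows.append((depth, term, parent))
        stack.append(term)
    return rows


# Hypothetical outline; labels paraphrase the caption, not official VariO names.
outline = [
    "RNA variation type",
    "  RNA variation classification",
    "  RNA variation origin",
]
rows = parse_indented_terms(outline)
for depth, term, parent in rows:
    print(depth, term, "<-", parent)
```

Each row records a term's depth and its parent, which is enough to rebuild the tree or to look up all ancestors of a term.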
Figure 2.
Examples of RNA chain variations. The original sequence is in the centre. In the variant sequences, the original bases at their original positions are underlined. In the coding region, deletions, indels and insertions are of either in-frame or out-of-frame type. A nonsense variation, which introduces a new, premature stop codon, is not included. Similarly, RNA splicing change is omitted; see Fig. 5 for details.
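The in-frame versus out-of-frame distinction in this caption follows from the triplet codon structure: a net length change that is a multiple of three preserves the reading frame, and any other change shifts it. The sketch below shows that rule only. The function name `frame_effect` and its return labels are hypothetical and are not VariO terms, and the sequences are made-up examples.

```python
def frame_effect(ref: str, alt: str) -> str:
    """Classify a coding-region change by its net effect on the reading frame.

    ref and alt are the original and variant sequence segments.
    Illustrative only: stop codons and splicing changes are not considered.
    """
    delta = len(alt) - len(ref)  # net number of bases gained or lost
    if delta == 0:
        return "no length change"
    return "in-frame" if delta % 3 == 0 else "out-of-frame"


print(frame_effect("AUGGCC", "AUG"))  # deletion of three bases
print(frame_effect("AUG", "AUGC"))    # insertion of one base
print(frame_effect("AUG", "ACG"))     # substitution, length unchanged
```

An indel is handled the same way as a pure insertion or deletion, because only the net length difference determines whether the downstream frame is kept.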
Figure 3.
Affected RNA types of non-coding and protein-coding terms. The hierarchy of the terms is indicated by indentation
Figure 4.
Terms describing structural variations. Note that details for affected RNA types are in Fig. 3. The hierarchy of the terms is indicated by indentation
Figure 5.
mRNA forms and mechanisms causing them. (A) The mRNA molecule (in the centre) can be modified in many ways. Exons are shown as boxes with different colours; introns are indicated with a thin line. mRNA molecules can have alternative initiation and termination positions, and polyadenylation can start at different sites. mRNA bases can be modified. During splicing, introns are cleaved. cis-Splicing is the most common splicing event and occurs within a single hnRNA molecule. In constitutive splicing, all exons are included. Exon skipping means that one or more exons are excluded from the mature mRNA. It can also appear as mutually exclusive exons, where only one of two exons is included in the final product. When a cryptic splice site is activated, a new cryptic exon out of an intron may be included. An intron fragment or an entire intron can be retained in the sequence. Variations can also lead to loss of an exon fragment. In trans-splicing, exons from different mRNA molecules are combined to form a chimeric RNA. (B) Constitutive splicing (top) and exon skipping (bottom). Exon skipping can occur for several reasons. It may be normal variation between cells or tissues, or dependent on the cellular developmental situation. Variations at splice sites or in their surroundings, such as in an exonic splicing enhancer, can lead to exon skipping. (C) Inclusion of intronic sequence in mature mRNA due to alternative 3ʹ acceptor (top left) or 5ʹ donor (top middle) splice sites, or because of novel splice site formation inside an intron (top right). The alternative splice sites can appear in either an exon or an intron. Mutually exclusive splicing (bottom) produces two forms that contain only one of two alternative exons (red and black lines). (D) Inclusion of a cryptic exon due to variation at a splice site or at a site activating the novel splice site.
Figure 6.
Three-dimensional and simplified ladder models for three-dimensional structures of RNA secondary structural elements. (A) Stem (cyan) and loop (pink) connecting the strands in the loop of the 3ʹ conserved region of eel LINE element UnaL2 (PDB entry 1wks [154]). (B) Bulge (pink) in non-coding prohead RNA from GA1 bacteriophage, which is involved in metal ion binding (2nci [155]). (C) Asymmetric internal loops A (yellow) and B (pink) in the SL1 domain of the human immunodeficiency virus HIV1 packaging signal (1m5l [156]). HIV is an RNA virus. (D) Pseudoknot in human telomerase RNA (2k96 [157]). The two stems are indicated in yellow and cyan, and the two loops in pink and dark blue, respectively. (E) Multiloop structure in the RNA tertiary domain essential to hepatitis C virus (HCV) internal ribosome entry site (IRES)-mediated translation initiation (1kh6 [158]). The four stems are indicated in cyan, red, green and yellow. In the case of an ensemble of structures, the representative chain was selected. The 2D structures were drawn with forna, based on force-directed graph layout [159], and the 3D structures were drawn with UCSF Chimera [160].
Figure 7.
RNA structures. (A) Double-stranded RNA helix (6IA2 [167]) in a self-complementary RNA duplex recognized by bacteriophage Mu zinc finger protein Com. (B) RNA triple helix in telomerase TER ribonucleoprotein complex RNA component (2K95 [157]). (C) G-quadruplex is a form of four-stranded RNA. The structure is for human telomeric RNA (2KBP [166]). (D) RNA-DNA complex of Cpf1 endonuclease R-loop complex (5MGA [170]). RNA chain in pink and DNA chains in cyan. The large protein component of the complex is not shown

References

    1. Brosius J, Raabe CA. What is an RNA? A top layer for RNA classification. RNA Biol. 2016;13:140–144.
    2. Vihinen M. Variation Ontology for annotation of variation effects and mechanisms. Genome Res. 2014a;24:356–364.
    3. Vihinen M. Types and effects of protein variations. Hum Genet. 2015b;134:405–421.
    4. Vihinen M. Systematics for types and effects of DNA variations. BMC Genomics. 2018;19:974.
    5. Vihinen M. Variation Ontology: annotator guide. J Biomed Semantics. 2014b;5:9.
