Review
F1000Res. 2014 Nov 18:3:282.
doi: 10.12688/f1000research.5654.1. eCollection 2014.

Interpretation of mRNA splicing mutations in genetic disease: review of the literature and guidelines for information-theoretical analysis

Natasha Caminsky et al. F1000Res. 2014.

Abstract

The interpretation of genomic variants has become one of the paramount challenges in the post-genome sequencing era. In this review we summarize nearly 20 years of research on the applications of information theory (IT) to interpret coding and non-coding mutations that alter mRNA splicing in rare and common diseases. We compile and summarize the spectrum of published variants analyzed by IT, to provide a broad perspective of the distribution of deleterious natural and cryptic splice site variants detected, as well as those affecting splicing regulatory sequences. Results for natural splice site mutations can be interrogated dynamically with Splicing Mutation Calculator, a companion software program that computes changes in information content for any splice site substitution, linked to corresponding publications containing these mutations. The accuracy of IT-based analysis was assessed in the context of experimentally validated mutations. Because splice site information quantifies binding affinity, IT-based analyses can discern the differences between variants that account for the observed reduced (leaky) versus abolished mRNA splicing. We extend this principle by comparing predicted mutations in natural, cryptic, and regulatory splice sites with observed deleterious phenotypic and benign effects. Our analysis of 1727 variants revealed a number of general principles useful for ensuring portability of these analyses and accurate input and interpretation of mutations. We offer guidelines for optimal use of IT software for interpretation of mRNA splicing mutations.
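The change in information content that the abstract refers to can be illustrated with a minimal sketch. It assumes the standard individual-information formulation, in which each base at each position contributes a weight of 2 + log2 f(b, l) bits and a site's R_i is the sum of these weights. The toy donor sequences, the pseudocount (used here in place of the small-sample correction), and all function names are hypothetical; this is not the Splicing Mutation Calculator or any published model.

```python
import math

BASES = "ACGT"

def weight_matrix(sites, pseudocount=1.0):
    """Per-position weights 2 + log2 f(b, l), in bits, from aligned sites.

    Simplification: a pseudocount stands in for the small-sample correction.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        matrix.append({
            base: 2.0 + math.log2((column.count(base) + pseudocount) / total)
            for base in BASES
        })
    return matrix

def individual_information(matrix, seq):
    """R_i of one sequence: the sum of its per-position weights."""
    return sum(column[base] for column, base in zip(matrix, seq))

def delta_ri(matrix, ref_seq, pos, alt_base):
    """Change in R_i (mutant minus reference) for one substitution."""
    mutant = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return individual_information(matrix, mutant) - individual_information(matrix, ref_seq)

# Hypothetical toy alignment of donor-like sites (not real training data).
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT", "TAGGTGAGC"]
matrix = weight_matrix(toy_sites)

# Substitute the invariant G at index 3 (0-based) with A.
change = delta_ri(matrix, "CAGGTAAGT", 3, "A")
print(f"Delta R_i = {change:.2f} bits")
```

A negative value indicates a predicted weakening of the site and a positive value a strengthening. A real analysis would use a weight matrix built from a genome-wide set of annotated splice sites.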

Keywords: genetic disease; mRNA; mutation; rare disease; splicing.

Conflict of interest statement

Competing interests: PKR is the inventor of US Patent 5,867,402 and other patents pending, which underlie the prediction and validation of mutations. He is one of the founders of Cytognomix, Inc. which is developing software based on this technology for complete genome or exome splicing mutation analysis.

Figures

Figure 1. Distribution of deleterious natural site variants relative to information content.
A) The sequence logo for human acceptor and donor splice sites based on the positive (+) strand of the October 2000 (hg5) genome draft. The logo shows the distribution of information content (R_i, in bits) at each position over a region of 28 nucleotides for acceptors [-25, +2] and 10 nucleotides for donors [-3, +6], counted from the first nucleotide of the splice junction (position 0). The height of each nucleotide represents its frequency at that position. The horizontal bar atop each stack indicates the standard deviation at that position. This figure was modified from Rogan et al. (2003) to include splice sites in genes on both strands of the annotated human reference genome. B) The distribution of deleterious single-nucleotide variants reported at the natural acceptor (left) and donor (right) splice sites. The variants used to populate this graph (Supplementary Table 8) were included only if they were reported to negatively affect splicing (N = 419 for acceptors, 599 for donors). The image was aligned to the sequence logo (A) to illustrate the potential correlation between the number of splicing variants at a position and the information content at that position.
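How the stack and letter heights of a sequence logo like the one in panel A are derived can be sketched as follows. The sketch assumes the conventional logo construction: the total stack height at a position is 2 minus the Shannon entropy of the base frequencies (in bits), and each letter's height is its frequency times that total. The small-sample correction is omitted, and the input alignment and function name are hypothetical.

```python
import math
from collections import Counter

def logo_heights(sites):
    """Letter heights (bits) per position for aligned, equal-length sites.

    Simplification: no small-sample correction is applied.
    """
    n = len(sites)
    heights = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        freqs = {base: counts.get(base, 0) / n for base in "ACGT"}
        # Shannon entropy of the column; zero-frequency bases contribute nothing.
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        info = 2.0 - entropy  # total stack height at this position
        heights.append({base: f * info for base, f in freqs.items()})
    return heights

# Hypothetical two-position alignment: position 0 is invariant, position 1 varies.
heights = logo_heights(["AG", "AG", "AT", "AC"])
for pos, column in enumerate(heights):
    print(pos, {base: round(h, 3) for base, h in column.items() if h > 0})
```

An invariant position reaches the 2-bit maximum, while a position where all four bases are equally frequent has a stack height of zero.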
Figure 2. Sample retrieval of average change in information content (ΔR_i) with Splicing Mutation Calculator (SMC) for published mutations.
A) Example mutation input for SMC (T>A at the 3rd intronic position of a natural acceptor). The type of splice site is selected by clicking on the corresponding sequence logo (acceptor [left] or donor [right]). The purple slider bar appearing below the logo is used to select the position of the mutation. The reference and mutant nucleotides are then designated, and the variant is submitted to the software (‘Submit your selection’). SMC outputs a table indicating the user input, the number of instances in the literature where this substitution has been analyzed using IT, and the computed ΔR_i values (in bits) using both the old (1992; top) and new (2003; bottom) ribls. The cell color for ΔR_i values indicates the predicted severity of the inputted variant according to defined thresholds. B) Tabular output detailing each instance of the selected mutation from the source table. The user may view, in a separate window, extensive details of all variants referred to in SMC output (Supplementary Table 10).
Figure 3. Ribl used for the prediction of a variant’s effect on branch-point sites.
Sequence logo of the information model for the branch-point site, created using 20 annotated branch-point sequences.
Figure 4. Outcomes of cryptic splicing mutations.
A prototypical internal exon (in purple) with flanking exons (in blue); introns are represented by solid and dashed black lines (top). The three types of cryptic splice site activation are then illustrated. Type 1 cryptic splice site activation (left) is caused by the activation (green arrow) of a cryptic site, either by strengthening a pre-existing site or by creating a novel splice site (blue). Type 2 (middle) results from the simultaneous weakening or abolition (red arrow) of the natural splice site and strengthening or creation (green arrow) of a cryptic site. Type 3 (right) involves the activation of a pre-existing cryptic site due to the weakening or abolition of the natural splice site (indicated by the orange triangle). For each type, the number of cases reported in the literature that have been analyzed by IT is indicated, with the percent accuracy in parentheses. The bottom row represents the mRNA structure resulting from the activated cryptic splice site.
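The three activation types described in this caption can be expressed as a small decision rule over the changes in information at the natural and cryptic sites. This is only a sketch of one reading of the caption: it assumes that Type 1 leaves the natural site unchanged, and the function name, the tolerance parameter, and the example values are all hypothetical. It does not decide whether a cryptic site is actually used, which also depends on the relative strengths of the competing sites.

```python
def classify_cryptic_activation(delta_natural, delta_cryptic, tolerance=0.0):
    """Label a variant by which sites it changes (inputs are Delta R_i in bits).

    tolerance is a hypothetical cutoff below which a change is ignored.
    """
    natural_weakened = delta_natural < -tolerance
    cryptic_strengthened = delta_cryptic > tolerance

    if cryptic_strengthened and not natural_weakened:
        return "Type 1"  # cryptic site strengthened or created; natural site intact
    if cryptic_strengthened and natural_weakened:
        return "Type 2"  # natural site weakened and cryptic site strengthened together
    if natural_weakened:
        return "Type 3"  # natural site weakened; pre-existing cryptic site unchanged
    return "no predicted cryptic activation"

# Hypothetical Delta R_i pairs: (natural site, cryptic site).
examples = [(0.0, 3.2), (-7.5, 2.1), (-7.5, 0.0), (0.0, 0.0)]
labels = [classify_cryptic_activation(nat, cry) for nat, cry in examples]
print(labels)
```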
Figure 5. Distribution of activated cryptic sites.
The frequency of validated cryptic splice acceptors (A) and donors (B) occurring at positions relative to the natural splice site. Positions are given using ASSEDA coordinates. The lower panel expands the cryptic site distribution in the region surrounding the natural splice site.
