Bioinformatics. 2013 Sep 15;29(18):2300-10.
doi: 10.1093/bioinformatics/btt396. Epub 2013 Jul 11.

Inference of alternative splicing from RNA-Seq data with probabilistic splice graphs

Laura H LeGault et al. Bioinformatics. 2013.

Abstract

Motivation: Alternative splicing and other processes that allow for different transcripts to be derived from the same gene are significant forces in the eukaryotic cell. RNA-Seq is a promising technology for analyzing alternative transcripts, as it does not require prior knowledge of transcript structures or genome sequences. However, analysis of RNA-Seq data in the presence of genes with large numbers of alternative transcripts is currently challenging due to efficiency, identifiability and representation issues.

Results: We present RNA-Seq models and associated inference algorithms based on the concept of probabilistic splice graphs, which alleviate these issues. We prove that our models are often identifiable and demonstrate that our inference methods for quantification and differential processing detection are efficient and accurate.

Availability: Software implementing our methods is available at http://deweylab.biostat.wisc.edu/psginfer.

Contact: cdewey@biostat.wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
An example gene for which an explicit model of all possible isoform frequencies is not identifiable, whereas a PSG model for the gene is identifiable, given RNA-Seq reads. We assume that the RNA-Seq fragments are shorter than the middle exon and thus that reads from a fragment identify at most one splice junction. (A) The gene model with levels of coverage by RNA-Seq reads indicated above each exon. (B) The four possible isoforms of the gene. (C) and (D) give two (of infinitely many) possible isoform abundances that explain the observed RNA-Seq read coverages equally well. (E) The exon graph PSG for the gene, which is identifiable given this data (the unique ML parameters are above each edge), assuming the exon sequences are relatively unique
Fig. 2.
Example PSG representations for the mouse gene Gfra4. (A) A UCSC Genome Browser visualization of the seven annotated isoforms of this gene. (B) The line graph PSG. (C) The first-order exon graph. (D) A higher-order exon graph. In this graph, the AP events immediately following the longest exon are allowed to depend on the AP event directly preceding the exon, in contrast to the first-order exon graph, in which these AP events are independent of each other, given that the longest exon is included in the transcript. (E) An unfactorized PSG
Fig. 3.
Distributions of the differences between the parameter estimates of EM and JR from single and paired-end data
Fig. 4.
The mean distances of parameter estimates on bootstrap samples from those on the full read set as a function of the bootstrapped read sample size. All differences between pairs of methods at each read set size are significant (formula image, sign test) except for that between Single EM and Paired JR for read set size = 10
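
The identifiability argument in the Fig. 1 caption can be illustrated with a small numerical sketch. The gene layout here is an assumption made for illustration, not taken from the paper: five exons E1 to E5, with E2 and E4 as cassette exons on either side of a long middle exon, giving four isoforms. Because each fragment is assumed to identify at most one splice junction, the reads only constrain how often each cassette exon is included, not how the two inclusion events co-occur. All frequencies and function names below are hypothetical.

```python
from itertools import product

# Hypothetical gene: exons E1..E5, with E2 and E4 as cassette exons.
# Each isoform is keyed by (includes_E2, includes_E4), giving four isoforms.
ISOFORMS = list(product([True, False], repeat=2))


def junction_marginals(freqs):
    """Return (P(E2 included), P(E4 included)) implied by isoform frequencies.

    Under the assumption that no fragment spans the middle exon, these two
    marginals are all that the read data can determine.
    """
    p_e2 = sum(f for (inc2, _), f in freqs.items() if inc2)
    p_e4 = sum(f for (_, inc4), f in freqs.items() if inc4)
    return p_e2, p_e4


def psg_isoform_freqs(p_e2, p_e4):
    """Isoform frequencies under a first-order exon-graph PSG.

    Each isoform is a path through the graph, and its probability is the
    product of the edge weights along that path.
    """
    return {
        (inc2, inc4): (p_e2 if inc2 else 1 - p_e2) * (p_e4 if inc4 else 1 - p_e4)
        for inc2, inc4 in ISOFORMS
    }


# Two different explicit isoform distributions (invented numbers).
freqs_c = {(True, True): 0.5, (True, False): 0.0,
           (False, True): 0.0, (False, False): 0.5}
freqs_d = {(True, True): 0.25, (True, False): 0.25,
           (False, True): 0.25, (False, False): 0.25}

# Both imply the same observable inclusion rates, so the explicit
# four-isoform model cannot distinguish them.
print(junction_marginals(freqs_c))
print(junction_marginals(freqs_d))

# The PSG has only two free parameters, which the marginals fix uniquely.
print(psg_isoform_freqs(*junction_marginals(freqs_c)))
```

In this sketch the explicit model has three free parameters but the data supply only two constraints, whereas the PSG's two edge probabilities are pinned down by the same two constraints. The price is the independence assumption between the two splicing events, which the higher-order exon graph in Fig. 2 relaxes.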
