BMC Bioinformatics. 2016 Oct 18;17(1):423.
doi: 10.1186/s12859-016-1254-8.

Roar: detecting alternative polyadenylation with standard mRNA sequencing libraries

Elena Grassi et al. BMC Bioinformatics.

Abstract

Background: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Changes in 3' UTR length play an important role in this context: alternative 3' UTRs can alter, for example, regulation by microRNAs and the cellular localization of transcripts, and thereby change their fate.

Results: We propose a strategy to identify genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries; it is therefore widely applicable to data originally generated for classical differential expression analyses. We exploit previously annotated alternative polyadenylation (APA) sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing its results with those of methods based on microarrays or on specialized RNA-seq libraries, and we show that using APA site databases yields higher sensitivity than de novo site prediction.

Conclusions: We implemented the algorithm in a Bioconductor package to facilitate its broad use in the scientific community. Because this approach can detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses, it is useful for investigating whether alternative polyadenylation is relevant in a given biological process without requiring specific experimental assays.

Keywords: 3’ UTR; Bioconductor; Polyadenylation; RNA-sequencing; Software.

Figures

Fig. 1
APASdb reported sites across tissues. a - mean (+/- sem) of the fraction of reads assigned to the two sites with the most reads, for every gene with at least two overlapping sites in APASdb, across different tissues. For “alltogether” we pooled the site annotations of all 22 normal human tissues, normalizing reads with respect to the total number of reads found in each tissue and considering only the sites supported by more than 2 normalized reads. b - percentages of sites found in a given tissue that are also found in N other tissues: on average 29 % of sites are found in all tissues, and 50 % of the sites are found in at least 17 tissues.
Fig. 2
Pipeline. a - how we define gene structures starting from different transcripts. We obtain “melted genes” by collapsing together the structures of all the transcripts assigned to a gene. aPA: alternative polyadenylation site. cPA: canonical polyadenylation site. Thicker blue rectangles represent coding exons, while the others depict untranslated regions. b - an example of how roar works with the single APA annotation: in sample #1 the shorter isoform is more expressed, relative to the longer one, than in sample #2. Blue wavy shapes represent aligned mRNA-seq reads. c - how transcript fragments are defined in multiple APA analyses to efficiently count reads for all the possible APA choices. aPA1, aPA2 and aPA3 are three different APA sites reported for this sample gene.
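The fragment idea in panel c can be illustrated with a minimal sketch. It assumes, for illustration only, that a gene is cut at each annotated APA site into consecutive fragments, that reads are counted once per fragment, and that the counts upstream and downstream of each candidate site are then obtained by summing fragments. The function name `pre_post_counts` and the read counts are hypothetical and are not taken from the roar package.

```python
from itertools import accumulate


def pre_post_counts(fragment_counts):
    """Return (upstream, downstream) read totals for every internal cut point.

    fragment_counts holds one read count per consecutive fragment, ordered
    from the 5' side to the canonical polyadenylation site. Each boundary
    between two fragments stands for one candidate APA site (assumption).
    """
    prefix = list(accumulate(fragment_counts))
    total = prefix[-1] if prefix else 0
    # The last prefix sum covers the whole gene, so it is not a cut point.
    return [(upstream, total - upstream) for upstream in prefix[:-1]]


# Hypothetical counts for four fragments delimited by aPA1, aPA2 and aPA3.
counts = [120, 40, 25, 10]
pairs = pre_post_counts(counts)
for site, (upstream, downstream) in zip(["aPA1", "aPA2", "aPA3"], pairs):
    print(site, upstream, downstream)
```

Counting each fragment once and reusing the totals avoids recounting reads for every APA choice, which is the efficiency the caption refers to.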
Fig. 3
Example of read density and corresponding m/M values. a - Sashimi plot produced with IGV of two alignments for representative testes and brain samples over the PRE and POST portions that we consider for CAMSAP1, one of the genes with the strongest shortening signal in testes versus brain. Read density is clearly lower in testes on the POST portion. CAMSAP1 is on the negative strand, and the PRE fragment overlaps the coding portion and the beginning of the 3’ UTR of its last exon. b - Dot plot representing the m/M values obtained for the two testes and six brain samples. The larger m/M values for testes reflect the preferential expression of the short isoform in that tissue.
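The link between read density and m/M can be made concrete with a small sketch. It rests on a simplifying assumption that is not stated in this abstract: reads on the PRE portion come from both the short (m) and long (M) isoforms, reads on the POST portion come from the long isoform only, and read-length corrections are ignored. Under that assumption the PRE density is proportional to m + M and the POST density to M, so m/M is the density ratio minus one. The function name `m_over_M` and all counts and lengths are hypothetical.

```python
def m_over_M(pre_reads, post_reads, pre_len, post_len):
    """Estimate the short/long isoform ratio from PRE and POST read counts.

    Assumes PRE density ~ (m + M) and POST density ~ M, so that
    m/M = (pre_reads / pre_len) / (post_reads / post_len) - 1.
    """
    if post_reads <= 0 or pre_len <= 0 or post_len <= 0:
        raise ValueError("post_reads and both lengths must be positive")
    return (pre_reads * post_len) / (pre_len * post_reads) - 1.0


# Hypothetical samples with PRE and POST portions of 1000 nt each.
testes = m_over_M(pre_reads=800, post_reads=200, pre_len=1000, post_len=1000)
brain = m_over_M(pre_reads=600, post_reads=400, pre_len=1000, post_len=1000)

# A larger value in testes indicates relatively more short isoform there.
print(testes, brain, testes > brain)
```

With these made-up numbers the lower POST density in the testes sample gives the larger m/M value, matching the direction described in the caption.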
Fig. 4
Venn diagrams of overlaps between roar results and a standard approach, and between two different annotations for roar. a - MCF7 vs MCF10: overlap between shortened genes for roar and [13]. b - MCF7 vs MCF10: overlap between shortened genes for roar using PolyA_DB or APASdb.

References

    1. Tian B, Manley JL. Alternative cleavage and polyadenylation: the long and short of it. Trends Biochem Sci. 2013;38(6):312–20. doi: 10.1016/j.tibs.2013.03.005.
    2. Proudfoot NJ. Ending the message: poly(A) signals then and now. Genes Dev. 2011;25(17):1770–82. doi: 10.1101/gad.17268411.
    3. Elkon R, Ugalde AP, Agami R. Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet. 2013;14(7):496–506. doi: 10.1038/nrg3482.
    4. Zhang H, Lee JY, Tian B. Biased alternative polyadenylation in human tissues. Genome Biol. 2005;6(12):R100. doi: 10.1186/gb-2005-6-12-r100.
    5. Lianoglou S, Garg V, Yang JL, Leslie CS, Mayr C. Ubiquitously transcribed genes use alternative polyadenylation to achieve tissue-specific expression. Genes Dev. 2013;27(21):2380–96. doi: 10.1101/gad.229328.113.