Bioinformatics. 2018 Aug 1;34(15):2521-2529.
doi: 10.1093/bioinformatics/bty110.

TAPAS: tool for alternative polyadenylation site analysis

Ashraful Arefeen et al. Bioinformatics. 2018.

Abstract

Motivation: The length of the 3' untranslated region (3' UTR) of an mRNA is essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, correlation between diseases and the shortening (or lengthening) of 3' UTRs has been reported in the literature. This length is largely determined by the polyadenylation cleavage site in the mRNA. As alternative polyadenylation (APA) sites are common in mammalian genes, several tools have been published recently for detecting APA sites from RNA-Seq data or performing shortening/lengthening analysis. These tools consider either up to only two APA sites in a gene or only APA sites that occur in the last exon of a gene, although a gene may generally have more than two APA sites and an APA site may sometimes occur before the last exon. Furthermore, the tools are unable to integrate the analysis of shortening/lengthening events with APA site detection.

Results: We propose a new tool, called TAPAS, for detecting novel APA sites from RNA-Seq data. It can deal with more than two APA sites in a gene as well as APA sites that occur before the last exon. The tool is based on an existing method for finding change points in time series data, but some filtration techniques are also adopted to remove change points that are likely false APA sites. It is then extended to identify APA sites that are expressed differently between two biological samples and genes that contain 3' UTRs with shortening/lengthening events. Our extensive experiments on simulated and real RNA-Seq data demonstrate that TAPAS outperforms the existing tools for APA site detection or shortening/lengthening analysis significantly.
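The idea of treating read coverage along a 3' UTR as a series and looking for change points can be illustrated with a small sketch. This is not TAPAS's implementation: the abstract does not name the underlying change-point method, so the code below substitutes plain binary segmentation on the mean, and the function name `find_change_points` and the parameters `min_gain` and `min_len` are hypothetical. Each reported index is a position where the mean coverage shifts, that is, a candidate APA site that would still need the kind of filtering described above.

```python
from typing import List, Sequence


def find_change_points(coverage: Sequence[float],
                       min_gain: float = 50.0,
                       min_len: int = 3) -> List[int]:
    """Binary segmentation on the mean of a per-base coverage vector.

    Returns indices k such that coverage[:k] and coverage[k:] are split
    there. min_gain and min_len are illustrative thresholds (assumptions),
    not values taken from TAPAS.
    """
    n = len(coverage)
    # Prefix sums let us compute any segment's squared error in O(1).
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in coverage:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def sse(i: int, j: int) -> float:
        """Sum of squared deviations from the mean of coverage[i:j]."""
        s = prefix[j] - prefix[i]
        sq = prefix_sq[j] - prefix_sq[i]
        return sq - s * s / (j - i)

    found: List[int] = []

    def split(lo: int, hi: int) -> None:
        # Too short to hold two segments of at least min_len bases.
        if hi - lo < 2 * min_len:
            return
        whole = sse(lo, hi)
        best_gain, best_k = 0.0, None
        for k in range(lo + min_len, hi - min_len + 1):
            gain = whole - sse(lo, k) - sse(k, hi)
            if gain > best_gain:
                best_gain, best_k = gain, k
        # Accept the split only if it reduces the error enough, then recurse.
        if best_k is not None and best_gain >= min_gain:
            split(lo, best_k)
            found.append(best_k)
            split(best_k, hi)

    split(0, n)
    return sorted(found)


# Toy coverage with two drops, as two APA sites in one 3' UTR might produce.
coverage = [20] * 10 + [8] * 10 + [0] * 10
print(find_change_points(coverage))  # [10, 20]
```

Binary segmentation is used here only because it is short and easy to follow; a penalized exact method would be a natural replacement in real use.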

Availability and implementation: https://github.com/arefeen/TAPAS.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Performance of the tools in APA site detection on simulated data with different sequencing depths. (a) The sensitivity and (b) the precision.
Fig. 2. Number of correct APA sites detected by different tools on the real dataset when the flexible range for matching a predicted APA site to a true APA site of 3-Seq is 50 bps (a) and 100 bps (b).
Fig. 3. Performance of TAPAS, Cuffdiff, DESeq and DEXSeq in differential expression analysis in terms of sensitivity (a) and precision (b). Cuffdiff_anno denotes running Cuffdiff with the transcriptome annotation and DEXSeq_gene denotes running DEXSeq to detect DE genes (instead of DE APA sites).
Fig. 4. Performance of TAPAS, DaPars and ChangePoint on detecting genes with shortening/lengthening events in terms of sensitivity (a) and precision (b).
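The figure captions report sensitivity and precision, and Fig. 2 counts a predicted APA site as correct when it falls within a flexible range (50 or 100 bps) of a true site. A minimal sketch of such an evaluation follows. The one-to-one greedy matching rule and the name `evaluate_sites` are assumptions made for illustration; the exact matching procedure used in the paper is not stated in this text. Sensitivity is taken as matched sites over true sites, and precision as matched sites over predicted sites.

```python
from typing import Sequence, Tuple


def evaluate_sites(predicted: Sequence[int],
                   truth: Sequence[int],
                   tolerance: int = 50) -> Tuple[int, float, float]:
    """Match predicted to true site positions within +/- tolerance bases.

    Each true site can be matched by at most one prediction (greedy sweep
    over both sorted lists). Returns (matched, sensitivity, precision).
    """
    pred = sorted(predicted)
    true = sorted(truth)
    i = j = matched = 0
    while i < len(pred) and j < len(true):
        if abs(pred[i] - true[j]) <= tolerance:
            matched += 1          # count the pair and consume both sites
            i += 1
            j += 1
        elif pred[i] < true[j]:
            i += 1                # prediction lies too far upstream
        else:
            j += 1                # true site has no nearby prediction
    sensitivity = matched / len(true) if true else 0.0
    precision = matched / len(pred) if pred else 0.0
    return matched, sensitivity, precision


# Hypothetical coordinates, evaluated at the two ranges used in Fig. 2.
predicted = [1000, 1530, 4000]
truth = [1020, 1600, 2500, 4080]
print(evaluate_sites(predicted, truth, tolerance=50)[0])   # 1
print(evaluate_sites(predicted, truth, tolerance=100)[0])  # 3
```

Widening the range from 50 to 100 bps can only keep or increase the number of matches under this rule, which is why results are usually reported at more than one range.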

