Bioinformatics. 2020 Apr 15;36(8):2466-2473.
doi: 10.1093/bioinformatics/btz932.

Platform-integrated mRNA isoform quantification

Jiao Sun et al. Bioinformatics.

Abstract

Motivation: Accurate estimation of transcript isoform abundance is critical for downstream transcriptome analyses and can help reveal the precise molecular mechanisms underlying complex human diseases such as cancer. Isoform quantification approaches based on simplex mRNA sequencing (RNA-Seq) face the challenges of inherent sampling bias and unidentifiable read origins. A large-scale experiment shows that the consistency between RNA-Seq and other mRNA quantification platforms is relatively low at the isoform level compared to the gene level. In this project, we developed a platform-integrated model for transcript quantification (IntMTQ) to improve the performance of RNA-Seq on isoform expression estimation. IntMTQ, which benefits from the mRNA expression reported by the other platforms, provides more precise RNA-Seq-based isoform quantification and leads to more accurate molecular signatures for disease phenotype prediction.

Results: To assess the quality of the isoform expression estimated by IntMTQ, we designed three tasks for clustering and classification of 46 cancer cell lines with four different mRNA quantification platforms, including NanoString's newly developed nCounter technology. The results demonstrate that the isoform expression learned by IntMTQ consistently provides more and better molecular features for downstream analyses than five baseline algorithms that consider RNA-Seq data only. An independent RT-qPCR experiment on seven genes in twelve cancer cell lines showed that IntMTQ improved overall transcript quantification. The platform-integrated algorithms could be applied to large-scale cancer studies, such as The Cancer Genome Atlas (TCGA), for which both RNA-Seq and array-based platforms are available.

Availability and implementation: Source code is available at: https://github.com/CompbioLabUcf/IntMTQ.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Scatter plots of gene expression and isoform expression estimated by RNA-Seq and two other platforms. (A and B) show the correlation of gene expression between RNA-Seq and NanoString/Exon-array. (C and D) show the correlation of isoform expression between RNA-Seq and NanoString/Exon-array. eXpress (Roberts and Pachter, 2013) with sequence-specific bias correction was applied for isoform/gene expression quantification with RNA-Seq data.
Fig. 2.
Cancer cell line clustering by 100 marker isoforms estimated by IntMTQ. The black dashed horizontal lines separate the clusters of cancer cell lines. The four clusters from top to bottom are colon, breast, ovarian and lung cancer cell lines. The solid vertical blue lines indicate the isoform clusters derived by hierarchical clustering. The official gene symbols, with the RefSeq isoform names in parentheses, are listed at the bottom.
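
The cross-platform comparison summarized in Fig. 1 can be illustrated with a minimal sketch that correlates expression estimates from two platforms over the features they share. This is not the authors' code: the choice of Pearson correlation, the log2 transform with a pseudocount, the function names and the toy values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def log_correlation(platform_a, platform_b, pseudocount=1.0):
    """Correlate log2(expression + pseudocount) over features measured on both platforms.

    Each argument maps a feature name (gene or isoform) to a non-negative
    expression value. The pseudocount (an assumption) keeps zeros finite.
    """
    shared = sorted(set(platform_a) & set(platform_b))
    a = [math.log2(platform_a[k] + pseudocount) for k in shared]
    b = [math.log2(platform_b[k] + pseudocount) for k in shared]
    return pearson(a, b)


# Toy values invented for illustration only; they are not data from the paper.
rnaseq = {"iso1": 120.0, "iso2": 15.0, "iso3": 0.0, "iso4": 300.0}
nanostring = {"iso1": 95.0, "iso2": 40.0, "iso3": 2.0, "iso4": 410.0}

r = log_correlation(rnaseq, nanostring)
print(f"Pearson r on log2 scale: {r:.3f}")
```

Running the same function once on gene-level estimates and once on isoform-level estimates would give the kind of gene-versus-isoform consistency comparison the figure describes.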
