Cell Syst. 2018 Nov 28;7(5):556-561.e3.
doi: 10.1016/j.cels.2018.10.007. Epub 2018 Nov 14.

BioJupies: Automated Generation of Interactive Notebooks for RNA-Seq Data Analysis in the Cloud

Denis Torre et al. Cell Syst.

Abstract

BioJupies is a web application that enables the automated creation, storage, and deployment of Jupyter Notebooks containing RNA-seq data analyses. Through an intuitive interface, novice users can rapidly generate tailored reports to analyze and visualize their own raw sequencing files or gene expression tables, or to fetch data from >9,000 published studies containing >300,000 preprocessed RNA-seq samples. Generated notebooks contain the executable code of the entire pipeline, rich narrative text, interactive data visualizations, and differential expression and enrichment analyses. The notebooks are permanently stored in the cloud and made available online through a persistent URL. They are downloadable and customizable, and they can run within a Docker container. By providing an intuitive user interface for generating RNA-seq analysis notebooks, from the raw reads all the way to a complete interactive and reproducible report, BioJupies is a useful resource for experimental and computational biologists. BioJupies is freely available as a web-based application from http://biojupies.cloud.

Keywords: Data Commons; Jupyter Notebook; RNA-seq; bioinformatics; data visualization; enrichment analysis; pipeline; systems biology.

Conflict of interest statement

Declaration of Interests

The authors declare no competing interests.

Figures

Fig. 1. Schematic illustration of the BioJupies notebook generation workflow.
The user starts by uploading RNA-seq data to the BioJupies website (http://biojupies.cloud), or by selecting from thousands of publicly available datasets. If raw FASTQ files are provided, expression levels for each gene are quantified using a cloud-based quantification pipeline. The user subsequently selects the tools and parameters to apply to analyze the data. Finally, a server generates a Jupyter Notebook with the desired settings and returns a report to the user through a persistent URL. See also Figure S1.
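
The final step of this workflow, in which a server assembles a notebook from the user's selections, can be illustrated with a minimal sketch. It is not the BioJupies implementation: the function `build_notebook`, the dataset name, and the tool entries are hypothetical, and the code cells hold placeholder comments rather than real analysis code. The only thing the sketch relies on is that a Jupyter Notebook file is a JSON document with a list of markdown and code cells.

```python
import json


def build_notebook(dataset_name, tools):
    """Assemble a minimal Jupyter Notebook (format 4.4) as a plain dict.

    `tools` is a list of dicts with a "title" and a "code" string; both
    keys are assumptions made for this illustration.
    """
    cells = [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": f"# RNA-seq analysis report: {dataset_name}",
        }
    ]
    for tool in tools:
        # One narrative cell followed by one code cell per selected tool.
        cells.append(
            {"cell_type": "markdown", "metadata": {}, "source": f"## {tool['title']}"}
        )
        cells.append(
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": tool["code"],
            }
        )
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Hypothetical selection of tools; the code strings are placeholders.
selected_tools = [
    {"title": "Principal component analysis", "code": "# placeholder: PCA of expression values"},
    {"title": "Differential expression", "code": "# placeholder: control vs. perturbation"},
]

notebook = build_notebook("example_dataset", selected_tools)
notebook_json = json.dumps(notebook, indent=1)  # text that could be saved as a .ipynb file
```

Writing `notebook_json` to a file with the `.ipynb` extension should give a document that Jupyter can open, with one titled section per selected tool.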
