BMC Genomics. 2021 Mar 18;22(1):195.
doi: 10.1186/s12864-021-07469-6.

WASP: a versatile, web-accessible single cell RNA-Seq processing platform

Andreas Hoek et al. BMC Genomics.

Abstract

Background: The technology of single cell RNA sequencing (scRNA-seq) has gained massively in popularity as it allows unprecedented insights into cellular heterogeneity as well as identification and characterization of (sub-)cellular populations. Furthermore, scRNA-seq is almost ubiquitously applicable in medical and biological research. However, these new opportunities are accompanied by additional challenges for researchers regarding data analysis, as advanced technical expertise is required in using bioinformatic software.

Results: Here we present WASP, a software tool for the processing of Drop-Seq-based scRNA-Seq data. Our software facilitates the initial processing of raw reads generated with the ddSEQ or 10x protocol and generates demultiplexed gene expression matrices including quality metrics. The processing pipeline is implemented as a Snakemake workflow, while an R Shiny application is provided for interactive result visualization. WASP supports comprehensive analysis of gene expression matrices, including detection of differentially expressed genes, clustering of cellular populations and interactive graphical visualization of the results. The R Shiny application can be used with gene expression matrices generated by the WASP pipeline, as well as with externally provided data from other sources.

Conclusions: With WASP we provide an intuitive and easy-to-use tool to process and explore scRNA-seq data. To the best of our knowledge, it is currently the only freely available software package that combines pre- and post-processing of ddSEQ- and 10x-based data. Due to its modular design, it is possible to use any gene expression matrix with WASP's post-processing R Shiny application. To simplify usage, WASP is provided as a Docker container. Alternatively, pre-processing can be accomplished via Conda, and a standalone version for Windows is available for post-processing, requiring only a web browser.

Keywords: 10x; Barcode; R shiny; RNA-Seq; Single cell; Snakemake; UMI; ddSEQ.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of the analysis steps featured by WASP and the tools used for the individual tasks. a Shows the four main pre-processing steps of the Snakemake workflow as orange squares and the tools used for the processing in blue squares. b Shows the R Shiny-based steps from the post-processing as orange squares and the single cell specific R packages used during the analysis as blue squares
Fig. 2
Schematic overview of a WASP analysis. As a first step, users start the Snakemake workflow on a Linux-based system, providing the FASTQ file with the reads and a reference genome with corresponding annotation. The results (quality metrics) of the workflow are then presented in an R Shiny web application, which generates a gene expression matrix CSV file containing UMIs per gene and cell. This file, or a similar externally generated file, is then uploaded to the post-processing Shiny web application for further processing. Post-processing can be performed in an automated or manual mode, and its results are presented as a dynamic web page similar to the pre-processing results
Fig. 3
Examples of WASP pre-processing quality metrics. a A summary page shows quality metrics about identified barcodes, STAR mapping and featureCounts analyses. b Results for each analysis step, e.g. the STAR mapping, are also presented on a detailed page with interactive selection of metrics (e.g. categories of mapped and unmapped reads). c On the last page, users select the number of barcodes to be used for further analysis. By calculating a knee plot, WASP provides users with a suggested number of detected true-positive barcodes
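
The knee-plot idea in panel c can be illustrated with a minimal sketch. The abstract does not state how WASP locates the knee, so the heuristic below is an assumption: barcodes are ranked by UMI count, and the suggested cutoff is the rank where the cumulative UMI fraction is furthest above the diagonal. The function name `suggest_barcode_count` is hypothetical.

```python
def suggest_barcode_count(umi_counts):
    """Suggest how many top-ranked barcodes to keep (illustrative heuristic only).

    Assumption: the knee is taken as the rank at which the cumulative
    fraction of UMIs exceeds the fraction of barcodes by the largest margin.
    This is not claimed to be WASP's actual method.
    """
    counts = sorted(umi_counts, reverse=True)  # rank barcodes by UMI count
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        return 0

    best_rank = 0
    best_gap = float("-inf")
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        # Vertical distance between the cumulative curve and the diagonal.
        gap = running / total - rank / n
        if gap > best_gap:
            best_gap = gap
            best_rank = rank
    return best_rank


if __name__ == "__main__":
    # Five barcodes with many UMIs (likely cells) and 95 near-empty barcodes.
    example = [1000] * 5 + [10] * 95
    print(suggest_barcode_count(example))
```

In practice the suggestion would only be a starting point, since the caption notes that users select the final number of barcodes themselves.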
Fig. 4
Examples of WASP post-processing analyses. a 2D UMAP plot of clustered cells with detailed information about a selected cell. b After starting the Shiny web application in manual mode, users have to select a threshold defining the number of UMI counts below which a cell is discarded. c The elbow plot shows the standard deviation of each principal component and a calculated recommended cutoff (red dot and arrow) for use in the following analyses. Users can select custom values for the clustering resolution and the number of principal components used for the following analysis steps
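
The UMI threshold described in panel b can be sketched as a simple filter over a gene expression matrix. The CSV layout used here (genes as rows, cells as columns, first column holding gene names) is an assumption, as is the function name `filter_cells_by_umi`; the actual format of WASP's matrix file may differ.

```python
import csv
import io


def filter_cells_by_umi(csv_text, min_umis):
    """Discard cells whose total UMI count is below min_umis.

    Assumed layout (hypothetical): a header row "gene,<cell1>,<cell2>,..."
    followed by one row of integer UMI counts per gene.
    Returns (kept_cell_names, {gene: {cell: umi_count}}).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    cells = header[1:]

    # Sum UMI counts per cell (column) across all genes.
    totals = [0] * len(cells)
    for row in body:
        for i, value in enumerate(row[1:]):
            totals[i] += int(value)

    # A cell is kept only if its total reaches the threshold.
    keep = [i for i, total in enumerate(totals) if total >= min_umis]
    filtered = {
        row[0]: {cells[i]: int(row[1 + i]) for i in keep}
        for row in body
    }
    return [cells[i] for i in keep], filtered


if __name__ == "__main__":
    matrix = "gene,cellA,cellB,cellC\nG1,5,0,2\nG2,7,1,3\n"
    kept, counts = filter_cells_by_umi(matrix, min_umis=5)
    print(kept)
    print(counts)
```

Raising `min_umis` removes more low-coverage cells at the cost of discarding cells with genuinely low RNA content, which is presumably why the manual mode leaves the choice to the user.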
