Comput Struct Biotechnol J. 2021 Jun 30:19:3796-3798.
doi: 10.1016/j.csbj.2021.06.043. eCollection 2021.

UCell: Robust and scalable single-cell gene signature scoring

Massimo Andreatta et al. Comput Struct Biotechnol J.

Abstract

UCell is an R package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with Seurat objects. The UCell package and documentation are available on GitHub at https://github.com/carmonalab/UCell.
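The idea of a rank-based signature score built on the Mann-Whitney U statistic can be sketched as follows. This is a minimal illustration in Python, not the UCell implementation (UCell itself is an R package). The function names `rank_descending` and `u_score`, the `max_rank` cap, and the normalization to the range 0 to 1 are assumptions made for this example and are not stated in the abstract.

```python
def rank_descending(values):
    """Rank values from highest (rank 1) to lowest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + 1 + end + 1) / 2.0
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def u_score(expression, signature_idx, max_rank=None):
    """Score one cell for one signature from the ranks of its own genes.

    expression    -- expression values of all genes in a single cell
    signature_idx -- positions of the signature genes in `expression`
    max_rank      -- hypothetical cap: ranks beyond it are set to max_rank + 1
    """
    n = len(signature_idx)
    if max_rank is None:
        max_rank = len(expression)
    ranks = rank_descending(expression)
    capped = [r if r <= max_rank else max_rank + 1 for r in (ranks[i] for i in signature_idx)]
    # Mann-Whitney U: rank sum of the signature genes minus its minimum possible value.
    u = sum(capped) - n * (n + 1) / 2.0
    # Assumed normalization: 1 when the signature genes are the top-ranked genes.
    return 1.0 - u / (n * max_rank)


cell = [5, 0, 3, 1, 0, 2]          # toy expression vector for six genes
print(u_score(cell, [0, 2]))       # signature = the two most expressed genes
print(u_score(cell, [1, 4]))       # signature = two unexpressed genes
```

Because the score in this sketch uses only the gene ranks within a single cell, it does not change when other cells are added to or removed from the dataset, which is consistent with the robustness to dataset composition described in the abstract.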

Keywords: Cell type; Gene set enrichment; Gene signature; Module scoring; Single-cell.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Evaluating T cell signatures using UCell. A) UMAP representation of T cell subsets from the single-cell dataset by Hao et al. B) UCell score distribution in UMAP space for five gene signatures (listed in Table 1). C-D) Comparison of UCell score (C) and Seurat's AddModuleScore (D) distributions for a two-gene CD8 T cell signature (CD8A, CD8B), evaluated on the complete T cell dataset (black outlines) or on the subset of CD8 T cells only (red outlines); UCell scores for CD8 T cells have the same distribution in the complete and subset datasets, whereas AddModuleScore values depend strongly on dataset composition. E-F) Running time (E) and peak memory (F) for UCell and AUCell (which produces similar results) on datasets of different sizes show that UCell is about three times faster and requires up to ten times less memory on large (>10^4) single-cell datasets. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
