Cytocipher determines significantly different populations of cells in single-cell RNA-seq data
- PMID: 37449901
- PMCID: PMC10368802
- DOI: 10.1093/bioinformatics/btad435
Abstract
Motivation: Identification of cell types using single-cell RNA-seq is revolutionizing the study of multicellular organisms. However, typical single-cell RNA-seq analysis often involves post hoc manual curation to ensure clusters are transcriptionally distinct, which is time-consuming, error-prone, and irreproducible.
Results: To overcome these obstacles, we developed Cytocipher, a bioinformatics method and scverse-compatible software package that statistically determines significant clusters. Application of Cytocipher to normal tissue, development, disease, and large-scale atlas data reveals the broad applicability and power of Cytocipher to generate biological insights in numerous contexts. This included the identification of cell types not previously described in the datasets analysed, such as CD8+ T cell subtypes in human peripheral blood mononuclear cells; cell lineage intermediate states during mouse pancreas development; and subpopulations of luminal epithelial cells over-represented in prostate cancer. Cytocipher also scales to large datasets with high test performance, as shown by application to the Tabula Sapiens Atlas representing >480 000 cells. Cytocipher is a novel and generalizable method that statistically determines transcriptionally distinct and programmatically reproducible clusters from single-cell data.
Availability and implementation: The software version used for this manuscript has been deposited on Zenodo (https://doi.org/10.5281/zenodo.8089546) and is also available via GitHub (https://github.com/BradBalderson/Cytocipher).
© The Author(s) 2023. Published by Oxford University Press.
Conflict of interest statement
None declared.