Nucleic Acids Res. 2025 Mar 20;53(6):gkaf251.
doi: 10.1093/nar/gkaf251.

CellPie: a scalable spatial transcriptomics factor discovery method via joint non-negative matrix factorization

Sokratia Georgaka et al. Nucleic Acids Res. 2025.

Abstract

Spatially resolved transcriptomics has enabled the study of expression of genes within tissues while retaining their spatial identity. Most spatial transcriptomics (ST) technologies generate a matched histopathological image as part of the standard pipeline, providing morphological information that can complement the transcriptomics data. Here, we present CellPie, a fast, unsupervised factor discovery method based on joint non-negative matrix factorization of spatial RNA transcripts and histological image features. CellPie employs the accelerated hierarchical least squares method to significantly reduce the computational time, enabling efficient application to high-dimensional ST datasets. We assessed CellPie on three different human cancer types with different spatial resolutions, including a highly resolved Visium HD dataset, demonstrating both good performance and high computational efficiency compared to existing methods.
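The joint factorization described in the abstract can be illustrated with a minimal sketch. It assumes a weighted squared-error objective, ||X_genes - W H_genes||^2 + image_weight * ||X_image - W H_image||^2, and uses plain hierarchical alternating least squares (HALS) updates without the acceleration step. The function names, the image_weight parameter, and the synthetic data are illustrative assumptions; this is not CellPie's actual API and may differ from its exact objective.

```python
import numpy as np


def joint_loss(X_genes, X_image, W, H_genes, H_image, image_weight=1.0):
    """Weighted squared reconstruction error over both modalities (assumed objective)."""
    err_genes = np.sum((X_genes - W @ H_genes) ** 2)
    err_image = np.sum((X_image - W @ H_image) ** 2)
    return err_genes + image_weight * err_image


def joint_nmf_hals(X_genes, X_image, n_factors, image_weight=1.0,
                   n_iter=200, seed=0, eps=1e-10):
    """Factorize two non-negative matrices that share their rows (spots).

    Returns W (spots x factors), H_genes (factors x genes) and
    H_image (factors x image features), all non-negative.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X_genes.shape[0], n_factors))
    H_genes = rng.random((n_factors, X_genes.shape[1]))
    H_image = rng.random((n_factors, X_image.shape[1]))

    def update_loadings(X, H):
        # HALS: solve for one row of H at a time, holding everything else fixed.
        A = W.T @ X
        B = W.T @ W
        for k in range(n_factors):
            H[k] = np.maximum(eps, H[k] + (A[k] - B[k] @ H) / max(B[k, k], eps))

    for _ in range(n_iter):
        update_loadings(X_genes, H_genes)
        update_loadings(X_image, H_image)
        # The shared factor matrix sees both modalities, weighted by image_weight.
        A = X_genes @ H_genes.T + image_weight * (X_image @ H_image.T)
        B = H_genes @ H_genes.T + image_weight * (H_image @ H_image.T)
        for k in range(n_factors):
            W[:, k] = np.maximum(
                eps, W[:, k] + (A[:, k] - W @ B[:, k]) / max(B[k, k], eps)
            )
    return W, H_genes, H_image


# Synthetic example: 30 spots, 20 genes, 8 image features, 3 shared factors.
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))
X_genes = W_true @ rng.random((3, 20))
X_image = W_true @ rng.random((3, 8))

W, H_genes, H_image = joint_nmf_hals(X_genes, X_image, n_factors=3, image_weight=0.5)
print(joint_loss(X_genes, X_image, W, H_genes, H_image, image_weight=0.5))
```

Setting image_weight to 0 in this sketch removes the image term from the shared-factor update, so W is then driven by gene expression alone.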

Conflict of interest statement

None declared.

Figures

Graphical Abstract
Figure 1.
Graphical overview of the CellPie method. CellPie takes as input spatial gene expression counts (spots by genes) and paired morphological image features extracted from H&E images (spots by features). These two modalities are jointly factorized using CellPie, resulting in three matrices: a shared spots by factors matrix, containing the reduced parts-based representation (factors), and two individual matrices containing the weights of each of the features, a gene loading and an image loading matrix.
Figure 2.
Validation of CellPie on human prostate cancer data. (A) H&E image of the invasive prostate carcinoma tissue. (B) Pathologist’s annotations of the tissue. (C) Model selection for CellPie for a range of factors. (D) Clustering performance of CellPie for a range of modality weights. The plot shows three measures [Fowlkes–Mallows, adjusted Rand index (ARI), and adjusted mutual information] of clustering similarity between CellPie and the pathologist’s annotations. (E) CellPie’s selected factors that represent the pathologist’s annotated regions. (F) CellPie clusters computed using the Leiden algorithm on CellPie’s output factors. (G) Contingency table between CellPie’s clusters and pathologist’s annotations.
Figure 3.
Comparison of CellPie against other published dimensionality reduction methods: (A) pathologist’s annotations and clustering results of CellPie, NSF, NSFH, MEFISTO, FA, PNMF, and CellPie with only gene expression. To cluster the factors of each method, the Leiden clustering algorithm with six clusters was used. (B) ARI between the pathologist’s annotations and the clusters of the methods. (C) Gene ontology of Factors 5 and 20, using CellPie’s top 150 marker genes associated with each of those factors. (D) Highest ranked cell types per factor, computed using a published single-cell RNA-seq dataset and Scanpy’s score_genes function. (E) Running time (left) and maximum memory usage (right) for all the methods.
Figure 4.
(A) H&E image of the HER2-positive breast cancer sample (patient H1) and pathologist’s annotations. (B) CellPie’s selected factors. (C) Contingency matrix between CellPie’s factors and pathologist’s annotations. (D) Gene ontology of Factors 1, 7, and 9.
Figure 5.
(A) Pathologist’s annotations and clustering results of CellPie, MEFISTO, PNMF, FA, NSF, and NSFH factors. To cluster the factors of each method, the k-means clustering algorithm with six clusters was used. (B) ARI between the resulting clusters and the ground truth. (C) Spatial distribution of the tertiary lymphoid structure (TLS) score, computed using the results in [22]. (D) Pearson correlation between CellPie’s, MEFISTO’s, PNMF’s, NSF’s, NSFH’s, and FA’s factors and the TLS scores. (E) Running time (left) and maximum memory usage (right) for all the methods.
Figure 6.
(A) Top: CellPie’s Factor 34 and spatial gene expression of the REG1A gene, a marker of SELENOP+ macrophages. Bottom: CellPie’s Factor 11 and spatial gene expression of the TGFBI gene, a marker of SPP1+ macrophages. (B) Pearson correlation between CellPie’s Factors 34 and 11 and the REG1A and TGFBI genes, respectively, across a range of weights. (C) Pathways enriched in Factors 34 and 11, respectively, where the MSigDB Hallmark 2020 database was used. (D) Running time (top) and maximum memory usage (bottom) for NSF, NSFH, PNMF, CellPie, FA, and sklearn-NMF.
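Several of the captions above report the adjusted Rand index (ARI) between a method's clusters and the pathologist's annotations, alongside a contingency table. As a minimal sketch of how the ARI follows from such a table, the function below uses only the Python standard library. The function name and the example labels are made up for illustration and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI computed from the contingency table of two labelings of the same spots."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same spots")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two spots: the labelings trivially agree

    # Cell counts of the contingency table and its row and column totals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    pairs_cells = sum(comb(c, 2) for c in cells.values())
    pairs_rows = sum(comb(c, 2) for c in rows.values())
    pairs_cols = sum(comb(c, 2) for c in cols.values())

    expected = pairs_rows * pairs_cols / total_pairs  # agreement expected by chance
    maximum = 0.5 * (pairs_rows + pairs_cols)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (pairs_cells - expected) / (maximum - expected)


# Hypothetical annotations and cluster assignments for six spots.
annotations = ["tumour", "tumour", "tumour", "stroma", "stroma", "stroma"]
clusters = [0, 0, 1, 1, 1, 1]
print(adjusted_rand_index(annotations, clusters))
```

The score depends only on how spots are grouped, so renaming the cluster labels leaves it unchanged.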

References

    1. Zhuang X. Spatially resolved single-cell genomics and transcriptomics by imaging. Nat Methods. 2021; 18:18–22. doi:10.1038/s41592-020-01037-8.
    2. Li X, Wang CY. From bulk, single-cell to spatial RNA sequencing. Int J Oral Sci. 2021; 13:36. doi:10.1038/s41368-021-00146-0.
    3. Schier AF. Single-cell biology: beyond the sum of its parts. Nat Methods. 2020; 17:17–20. doi:10.1038/s41592-019-0693-3.
    4. Ståhl PL, Salmén F, Vickovic S et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 2016; 353:78–82. doi:10.1126/science.aaf2403.
    5. Rodriques SG, Stickels RR, Goeva A et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science. 2019; 363:1463–67. doi:10.1126/science.aaw1219.