BMC Bioinformatics. 2024 Dec 18;25(1):379.
doi: 10.1186/s12859-024-06003-1.

Spall: accurate and robust unveiling cellular landscapes from spatially resolved transcriptomics data using a decomposition network

Zhongning Jiang et al. BMC Bioinformatics.

Abstract

Recent developments in spatially resolved transcriptomics (SRT) enable the characterization of spatial structures in different tissues. Many decomposition methods have been proposed to depict the cellular distribution within tissues. However, existing computational methods struggle to balance spatial continuity in cell distribution with the preservation of cell-specific characteristics. To address this, we propose Spall, a novel decomposition network that integrates scRNA-seq data with SRT data to accurately infer cell type proportions. Spall introduces a GATv2 module, featuring a flexible dynamic attention mechanism that captures relationships between spots. This improves the identification of cellular distribution patterns in spatial analysis. Additionally, Spall incorporates skip connections to address the loss of cell-specific information, thereby enhancing its prediction capability for rare cell types. Experimental results show that Spall outperforms state-of-the-art methods in reconstructing cell distribution patterns on multiple datasets. Notably, Spall reveals tumor heterogeneity in human pancreatic ductal adenocarcinoma samples and delineates complex tissue structures, such as the laminar organization of the mouse cerebral cortex and the mouse cerebellum. These findings highlight the ability of Spall to provide reliable low-dimensional embeddings for downstream analyses, offering new opportunities for deciphering tissue structures.
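The two architectural ideas named in the abstract, GATv2-style dynamic attention between spots and a skip connection, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the single attention head, the additive form of the skip connection, and all names (`gatv2_layer`, `W_src`, `W_dst`, `a`) are assumptions made for illustration.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return [v if v > 0 else slope * v for v in x]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gatv2_layer(H, neighbors, W_src, W_dst, a, skip=True):
    """One single-head GATv2-style layer over a spot graph (illustrative only).

    H         : list of per-spot feature vectors
    neighbors : dict mapping spot index -> list of neighbour indices
    W_src/dst : weight matrices applied to the neighbour / the centre spot
    a         : attention vector, applied AFTER the nonlinearity (the GATv2 idea)
    skip      : if True, add the input features back (assumed additive residual;
                requires output dimension == input dimension)
    """
    out = []
    for i, h_i in enumerate(H):
        dst = matvec(W_dst, h_i)
        messages, scores = [], []
        for j in neighbors[i]:
            src = matvec(W_src, H[j])
            z = leaky_relu([s + d for s, d in zip(src, dst)])
            scores.append(sum(a_k * z_k for a_k, z_k in zip(a, z)))
            messages.append(src)
        alpha = softmax(scores)  # attention weights over the neighbours of spot i
        agg = [sum(w * m[k] for w, m in zip(alpha, messages)) for k in range(len(a))]
        if skip:
            agg = [x + y for x, y in zip(agg, h_i)]
        out.append(agg)
    return out

# Toy graph of three spots with 2-dimensional features (made-up values).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
identity = [[1.0, 0.0], [0.0, 1.0]]
print(gatv2_layer(H, neighbors, identity, identity, a=[1.0, 1.0]))
```

With `skip=True`, each spot keeps its own input features alongside the neighbourhood average, which is one simple way a skip path can preserve spot-specific signal that attention-based smoothing would otherwise dilute.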

Keywords: Cell type proportion; Decomposition; Graph neural network; Spatially resolved transcriptomics; Tissue structure.

PubMed Disclaimer

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
The workflow of Spall. (1) Data preprocessing: generation of pseudo spots from reference scRNA-seq data and feature selection on both the original SRT and scRNA-seq data. (2) Integration and graph construction: data integration followed by graph construction using either KNN or Random Projection Forest, depending on dataset scale. (3) Graph neural network training: implementation of two GATv2 modules and one skip connection module. (4) Downstream analysis: application of decomposition results estimated by Spall to various analytical tasks
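Steps (1) and (2) of the workflow can be sketched as follows. This is a simplified illustration under stated assumptions, not Spall's code: cells are drawn uniformly with replacement, expression is summed without normalization, and the graph is built by brute-force KNN (the caption's Random Projection Forest option for large datasets is not shown). The function names are hypothetical.

```python
import random
from collections import Counter

def make_pseudo_spot(expr, cell_types, n_cells, rng):
    """Build one pseudo spot by summing the expression of randomly drawn cells.

    Returns the summed expression vector and the known cell type proportions,
    which can serve as the training label for that pseudo spot.
    """
    idx = [rng.randrange(len(expr)) for _ in range(n_cells)]
    n_genes = len(expr[0])
    spot = [sum(expr[i][g] for i in idx) for g in range(n_genes)]
    counts = Counter(cell_types[i] for i in idx)
    props = {t: c / n_cells for t, c in counts.items()}
    return spot, props

def knn_graph(X, k):
    """Brute-force k-nearest-neighbour graph using squared Euclidean distance."""
    graph = {}
    for i, x in enumerate(X):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, y)), j)
            for j, y in enumerate(X)
            if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Toy reference: 3 cells x 2 genes with made-up counts and labels.
rng = random.Random(0)
expr = [[1, 0], [0, 2], [3, 1]]
cell_types = ["A", "B", "A"]
spot, props = make_pseudo_spot(expr, cell_types, n_cells=5, rng=rng)
print(spot, props)

# Toy feature vectors standing in for integrated spots and pseudo spots.
print(knn_graph([[0, 0], [0, 1], [5, 5], [5, 6]], k=1))
```

Because each pseudo spot is assembled from labelled cells, its true proportions are known, which is what lets a decomposition network be trained on them before being applied to real spots.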
Fig. 2
Results of different methods using synthetic datasets. A The performance across two spatial patterns with jump transition generated using scRNA-seq. In the violin plots: the center line represents the median; box limits denote the upper and lower quartiles; whiskers extend to 1.5 × the interquartile range. In the bar plot: bar height represents the mean value, with whiskers indicating the mean ± 95% confidence intervals. The experiments were repeated 10 times. B Results on STARmap-based in silico dataset. Bar plots display the mean values, with whiskers indicating the mean ± 95% confidence intervals. (p-values indicate the significance level under one-sided t-test for positive correlation. ****p ≤ 0.0001, ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05, ⋅p ≤ 0.1, ns: p > 0.1)
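As a reading aid for the bar-plot whiskers, the sketch below computes a mean with a 95% confidence interval from 10 repeats. It assumes a Student's t interval, which the caption does not confirm, and the scores are made-up numbers rather than results from the paper.

```python
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (10 repeats minus 1). Assumption: a t-based interval; the caption does not
# state how the interval was computed.
T_CRIT_DF9 = 2.262

def mean_ci95(values, t_crit=T_CRIT_DF9):
    """Return (mean, lower, upper) of a t-based 95% confidence interval."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / n ** 0.5  # standard error of the mean
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width

# Hypothetical scores from 10 repeated runs (not taken from the paper).
scores = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.84, 0.80, 0.81, 0.82]
mean, lower, upper = mean_ci95(scores)
print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```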
Fig. 3
Analysis results of Spall on PDAC dataset. A Manual annotation of the PDAC dataset. B Pie plots showing the cell type composition for each spot using Spall. C Top: Proportions of specific cell types as predicted by Spall. Bottom: Spatial distribution of the corresponding marker genes for these cell types. D Spatial region clustering results based on cell type proportions inferred by Spall. E ARI and NMI scores comparing the region annotations with clustering results obtained by Spall and 10 other algorithms. F Comparative analysis of the abundance of four cell types in cancerous regions (n = 137 spots) versus non-cancerous regions (n = 289 spots); the center line represents the median, box limits show the upper and lower quartiles, and whiskers indicate 1.5 × the interquartile range. G TP53 related gene regulatory network analysis in cancerous regions
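Panel E compares clusterings against region annotations using ARI and NMI. A minimal pure-Python sketch of both scores follows. The NMI normalization (arithmetic mean of the two entropies) is an assumption, since the caption does not say which variant was used, and the toy labels are illustrative only.

```python
from collections import Counter
from math import comb, log

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items (n >= 2)."""
    n = len(labels_a)
    pairs = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:               # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def normalized_mutual_info(labels_a, labels_b):
    """NMI with arithmetic-mean normalization (an assumed variant)."""
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    pairs = Counter(zip(labels_a, labels_b))
    mi = sum(c / n * log(c * n / (ca[a] * cb[b])) for (a, b), c in pairs.items())
    h_a = -sum(c / n * log(c / n) for c in ca.values())
    h_b = -sum(c / n * log(c / n) for c in cb.values())
    denom = 0.5 * (h_a + h_b)
    return mi / denom if denom > 0 else 1.0

# Toy example: annotation vs. a clustering that differs only in label names.
annotation = ["cancer", "cancer", "duct", "duct"]
clusters = [1, 1, 0, 0]
print(adjusted_rand_index(annotation, clusters))
print(normalized_mutual_info(annotation, clusters))
```

Both scores ignore the label names and depend only on how items are grouped, which is why a clustering can be compared directly with manual region annotations.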
Fig. 4
Analysis of mouse cerebral cortex tissue using Spall. A Left: Image of the mouse brain tissue slice. Right: Pie plots representing the cell type composition at each spot, decomposed by Spall. B Spatial distribution of six layer-specific neuron populations predicted by Spall. C Comparison of spatial domain identification based on cell type proportions inferred by Spall versus direct use of gene expression profiles. D Left: Differential gene expression analysis of domains identified by Spall. Right: Top 5 differentially expressed genes in domain 14
Fig. 5
The results of Spall for mouse cerebellum. A Pie plots showing the cell type composition at each spot, deconvolved by Spall from cerebellar tissue data. B Spatial domain identification based on cell type proportions estimated by Spall. C Top: spatial proportions of four layer-specific cell types predicted by Spall. Bottom: Corresponding spatial domains identified using proportions predicted by Spall. D Cell type abundance analysis across the four spatial domains. The center line represents the median, box limits indicate the upper and lower quartiles, and whiskers correspond to 1.5 × the interquartile range
Fig. 6
Analysis of the large-scale MOB dataset generated from Stereo-seq using Spall. A Regional annotation of the MOB dataset. B Spatial domains identified based on cell type proportions estimated by Spall. C Stacked violin plot showing differential gene expression across the spatial domains identified by Spall. D Top: Spatial proportions of four layer-specific cell types predicted by Spall. Middle: Spatial expression of corresponding cell-type-specific marker genes. Bottom: The four corresponding spatial domains identified from decomposition results inferred by Spall. E Cell type abundance analysis in domain 9. The center line represents the median, box limits show the upper and lower quartiles, and whiskers indicate 1.5 × the interquartile range
