Brief Bioinform. 2024 Mar 27;25(3):bbae171.
doi: 10.1093/bib/bbae171.

stDiff: a diffusion model for imputing spatial transcriptomics through single-cell transcriptomics

Kongming Li et al. Brief Bioinform. 2024.

Abstract

Spatial transcriptomics (ST) has become a powerful tool for exploring the spatial organization of gene expression in tissues. Imaging-based methods, though offering superior spatial resolutions at the single-cell level, are limited in either the number of imaged genes or the sensitivity of gene detection. Existing approaches for enhancing ST rely on the similarity between ST cells and reference single-cell RNA sequencing (scRNA-seq) cells. In contrast, we introduce stDiff, which leverages relationships between gene expression abundance in scRNA-seq data to enhance ST. stDiff employs a conditional diffusion model, capturing gene expression abundance relationships in scRNA-seq data through two Markov processes: one introducing noise to transcriptomics data and the other denoising to recover them. The missing portion of ST is predicted by incorporating the original ST data into the denoising process. In our comprehensive performance evaluation across 16 datasets, utilizing multiple clustering and similarity metrics, stDiff stands out for its exceptional ability to preserve topological structures among cells, positioning itself as a robust solution for cell population identification. Moreover, stDiff's enhancement outcomes closely mirror the actual ST data within the batch space. Across diverse spatial expression patterns, our model accurately reconstructs them, delineating distinct spatial boundaries. This highlights stDiff's capability to unify the observed and predicted segments of ST data for subsequent analysis. We anticipate that stDiff, with its innovative approach, will contribute to advancing ST imputation methodologies.
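
The noising half of the two Markov processes can be illustrated with the standard DDPM closed form. The sketch below is not the authors' code: the linear schedule, the step count and the toy matrix are assumptions chosen for illustration. It shows that a sample at any step is a weighted mix of the clean data and Gaussian noise, and that knowing the noise is enough to recover the clean data, which is why a network trained to predict the noise can be used for denoising.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed, commonly used choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal retention
    return betas, alpha_bars

def forward_noise(x0, t, alpha_bars, rng):
    """Draw x_t from the forward process in closed form; also return the noise."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

def recover_x0(x_t, eps, t, alpha_bars):
    """Invert the forward step when the added noise is known."""
    return (x_t - np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])

rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = rng.random((4, 6))  # toy cells-by-genes matrix with made-up values
x_t, eps = forward_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
x0_hat = recover_x0(x_t, eps, t=500, alpha_bars=alpha_bars)
```

At the last step of this schedule almost no signal remains, so the reverse process can start from pure Gaussian noise.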

Keywords: diffusion model; imputation; scRNA-seq data; spatial transcriptomics data.

Figures

Figure 1
Framework of stDiff. (A) Brief framework of DDPM. The forward diffusion process (left to right) gradually adds Gaussian noise to the target data. The reverse process (right to left) iteratively denoises the target data. (B) Training process of stDiff. scRNA-seq data [formula] undergo noise perturbation to give [formula]. Noise dependent on the time step [formula] is then added, resulting in [formula]. The shared part of [formula] and the unique part of [formula] are concatenated to form [formula]. Finally, a denoising network [formula] is trained to predict the introduced noise. The training process is guided by the shared gene part of [formula]. (C) Inference process of stDiff. ST data serve as the condition that guides the learned denoising network [formula] to denoise step by step from random noise. The final result after removing the introduced noise is the predicted imputation for ST data.
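
The training and inference steps in panels (B) and (C) can be sketched as follows. This is a rough illustration under stated assumptions, not the published implementation: `training_example`, `impute_unique_genes`, `oracle_noise`, the perturbation scale and the 50-step schedule are all hypothetical, and the exact way stDiff combines the perturbed and noised data is inferred from the caption only. A closed-form "oracle" stands in for the trained denoising network so that the example runs without a deep learning framework.

```python
import numpy as np

def training_example(x0, shared_mask, t, alpha_bars, rng, perturb_std=0.1):
    """Build one (network input, noise target) pair from scRNA-seq cells."""
    # Light perturbation of the clean data (perturb_std is a made-up value).
    x_pert = x0 + perturb_std * rng.standard_normal(x0.shape)
    # Step-dependent Gaussian noise via the closed-form DDPM forward process.
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x_pert + np.sqrt(1.0 - alpha_bars[t]) * eps
    # Assumption: shared genes come from the perturbed data, unique genes from x_t.
    net_input = np.where(shared_mask, x_pert, x_t)
    return net_input, eps[:, ~shared_mask]

def impute_unique_genes(st_shared, shared_mask, predict_noise, betas, rng):
    """Ancestral DDPM sampling of unique genes, conditioned on ST shared genes."""
    alpha_bars = np.cumprod(1.0 - betas)
    n_cells = st_shared.shape[0]
    n_unique = int((~shared_mask).sum())
    x = np.empty((n_cells, shared_mask.size))
    x[:, shared_mask] = st_shared                     # the condition stays fixed
    x[:, ~shared_mask] = rng.standard_normal((n_cells, n_unique))
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        mean = (x[:, ~shared_mask]
                - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(1.0 - betas[t])
        noise = rng.standard_normal(mean.shape) if t > 0 else np.zeros_like(mean)
        x[:, ~shared_mask] = mean + np.sqrt(betas[t]) * noise
    return x[:, ~shared_mask]

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)                   # short assumed schedule
alpha_bars = np.cumprod(1.0 - betas)
shared_mask = np.array([True, True, True, False, False])
st_shared = rng.random((4, 3))                        # toy ST measurements

# Toy ground truth: each unique gene equals the cell's mean over shared genes.
truth = np.repeat(st_shared.mean(axis=1, keepdims=True), 2, axis=1)

def oracle_noise(x, t):
    """Stand-in for a trained network: exact noise under the toy ground truth."""
    target = x[:, shared_mask].mean(axis=1, keepdims=True)
    return (x[:, ~shared_mask] - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

imputed = impute_unique_genes(st_shared, shared_mask, oracle_noise, betas, rng)
net_input, target_noise = training_example(rng.random((4, 5)), shared_mask, 10, alpha_bars, rng)
```

With a perfect noise predictor the sampled unique genes land on the toy ground truth; a real network would only approximate this.
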
Figure 2
UMAP plots illustrating scRNA-seq data, real ST data and imputed ST data generated by Tangram, gimVI, stPlus, SpaGE, uniPort, SpatialScope and stDiff. (A) and (B) correspond to Dataset2_osmFISH and Dataset3_ExSeq in Table 1, respectively.
Figure 3
Clustering metrics (ARI, AMI, Homogeneity, NMI) demonstrating the topological consistency among cells between authentic ST data and predicted data generated by Tangram, gimVI, stPlus, SpaGE, uniPort, SpatialScope and stDiff across different platforms of ST data.
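
As an illustration of how such a clustering metric compares two label assignments, here is a minimal standard-library sketch of the adjusted Rand index (ARI). The label lists are invented for the example; in practice the labels would come from clustering the real and the imputed ST data separately, and scikit-learn's `adjusted_rand_score`, `adjusted_mutual_info_score`, `homogeneity_score` and `normalized_mutual_info_score` cover the four metrics named in the caption.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two clusterings of the same cells (needs at least 2 cells)."""
    n = len(labels_a)
    # Pairs of cells placed together in both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each clustering on its own.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)   # chance agreement
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:                           # degenerate partitions
        return 1.0
    return (together_both - expected) / (maximum - expected)

real_clusters = [0, 0, 1, 2]      # hypothetical labels from real ST data
imputed_clusters = [0, 0, 1, 1]   # hypothetical labels from imputed ST data
ari = adjusted_rand_index(real_clusters, imputed_clusters)
```

The score does not depend on how clusters are named, only on which cells are grouped together.
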
Figure 4
Evaluation metrics (1-SPCC, 1-SSIM, RMSE, JS) to assess gene expression similarity between authentic ST data and predicted data generated by Tangram (Tan), gimVI (gim), SpaGE (Spa), stPlus (stP), uniPort (uni), SpatialScope (SpS) and stDiff (stD) across different platforms of ST data. (A)–(D) correspond to Dataset2_osmFISH, Dataset5_MERFISH, Dataset6_MERFISH and Dataset10_seqFISH in Table 1, respectively.
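
Two of these metrics, RMSE and the Jensen-Shannon (JS) divergence, can be sketched for a single gene as below. The normalization is an assumption: each expression vector is divided by its sum so that it can be treated as a distribution over cells, and base-2 logarithms keep JS between 0 and 1. The paper's exact preprocessing may differ, and the example vectors are made up.

```python
import numpy as np

def rmse(truth, pred):
    """Root mean squared error between measured and predicted expression."""
    return float(np.sqrt(np.mean((truth - pred) ** 2)))

def js_divergence(truth, pred):
    """JS divergence between two non-negative expression vectors (0 = identical)."""
    p = truth / truth.sum()   # assumed normalization to a distribution over cells
    q = pred / pred.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        support = a > 0       # zero-probability entries contribute nothing
        return np.sum(a[support] * np.log2(a[support] / b[support]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))

measured = np.array([0.0, 2.0, 5.0, 1.0])   # hypothetical expression of one gene
predicted = np.array([0.5, 1.5, 4.0, 2.0])  # hypothetical imputed values
gene_rmse = rmse(measured, predicted)
gene_js = js_divergence(measured, predicted)
```

Lower values of both metrics indicate that the imputed expression is closer to the measurement.
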
Figure 5
The predicted expression abundance of known spatially patterned genes in Dataset8_FISH. Each column corresponds to a single gene with a clear spatial pattern. The first row from the top displays the ground truth of spatial gene expression in Dataset8_FISH, while the subsequent rows show the corresponding predicted expression patterns through 5-fold cross-validation experiments using stDiff, Tangram, gimVI, SpaGE, stPlus, uniPort and SpatialScope.
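
A 5-fold cross-validation of this kind could be organized as in the sketch below. It assumes the folds are taken over genes shared between the ST and scRNA-seq data, so that each held-out gene has a measured spatial pattern to compare against; the caption does not state this, and the gene names and the helper `gene_folds` are hypothetical.

```python
import random

def gene_folds(genes, n_folds=5, seed=0):
    """Shuffle the genes and deal them into n_folds disjoint groups."""
    shuffled = list(genes)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]

shared_genes = [f"gene{i}" for i in range(12)]  # placeholder gene names
folds = gene_folds(shared_genes)

splits = []
for held_out in folds:
    hidden = set(held_out)
    visible = [g for g in shared_genes if g not in hidden]
    # An imputation method would see only `visible` in the ST data and be
    # scored on how well it predicts the genes in `held_out`.
    splits.append((visible, held_out))
```

Each gene is hidden exactly once, so every gene receives one prediction that can be compared with its measured pattern.
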
Figure 6
Boxplots and scatter plots of the AS for the data generated by the seven methods across all 15 paired datasets. The central line represents the median, the box depicts the interquartile range, whiskers extend to 1.5 times the interquartile range, and dots represent the AS of individual datasets. (A) The AS scores for clustering metrics. (B) The AS scores for gene similarity metrics. (C) The overall AS scores for all eight metrics.
