Brief Bioinform. 2023 Nov 22;25(1):bbad500.
doi: 10.1093/bib/bbad500.

stAA: adversarial graph autoencoder for spatial clustering task of spatially resolved transcriptomics

Zhaoyu Fang et al. Brief Bioinform.

Abstract

With the development of spatially resolved transcriptomics technologies, it is now possible to explore the gene expression profiles of single cells while preserving their spatial context. Spatial clustering plays a key role in spatial transcriptome data analysis. In the past 2 years, several graph neural network-based methods have emerged that significantly improved the accuracy of spatial clustering. However, accurately identifying the boundaries of spatial domains remains challenging. In this article, we propose stAA, an adversarial variational graph autoencoder, to identify spatial domains. stAA generates cell embeddings from gene expression and spatial information using graph neural networks and constrains the distribution of those embeddings to match a prior distribution by means of the Wasserstein distance. The adversarial training process helps the cell embeddings capture spatial domain information more effectively and makes them more robust. Moreover, stAA incorporates global graph information into the cell embeddings using labels generated by pre-clustering. Our experimental results show that stAA outperforms state-of-the-art methods and achieves better clustering results across different profiling platforms and resolutions. We also conducted a range of biological analyses and found that stAA can identify fine-grained structures in tissues, recognize different functional subtypes within tumors and accurately identify developmental trajectories.
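
As a rough illustration of why the Wasserstein distance is a usable signal for pulling embeddings toward a prior, the sketch below computes the empirical 1-Wasserstein distance between two equal-sized one-dimensional samples. This is not the stAA training procedure: the abstract does not say how the distance is estimated or which prior is used, so the standard normal prior, the single embedding coordinate, and the function name `wasserstein_1d` are all assumptions made for the example.

```python
import random


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-sized 1D samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    # For equal-sized 1D samples, the optimal transport plan pairs the
    # i-th smallest value of one sample with the i-th smallest of the other.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


rng = random.Random(0)
# Assumed prior: standard normal (the source only says "a prior distribution").
prior = [rng.gauss(0.0, 1.0) for _ in range(1000)]
# Hypothetical values of one embedding coordinate, far from and close to the prior.
shifted = [rng.gauss(2.0, 1.0) for _ in range(1000)]
matched = [rng.gauss(0.0, 1.0) for _ in range(1000)]

print("shifted vs prior:", wasserstein_1d(shifted, prior))
print("matched vs prior:", wasserstein_1d(matched, prior))
```

The distance shrinks smoothly as the sample moves toward the prior, which is the property an encoder can exploit when this quantity (or an estimate of it) is used as a training penalty.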

Keywords: adversarial learning; graph autoencoder; graph neural network; spatial domain; spatial transcriptomics.

Figures

Figure 1
Overview of stAA. (A) A variational graph autoencoder generates the latent embedding from gene expression profiles and spatial information. The gene expression data are partitioned into initial clusters by a traditional clustering method. A classifier enables the latent embedding to capture global graph information derived from these pre-clustering labels. An adversarial training scheme is used to improve the quality of the latent embedding, with the Wasserstein distance replacing a discriminator to measure the difference between the two probability distributions. (B) The generated latent embeddings are clustered using Louvain or mclust. The resulting clusters can be used for various downstream analyses, including trajectory inference and cell-to-cell communication analysis.
Figure 2
Comparison of stAA and other methods on the human DLPFC dataset. (A) Boxplots of the average ARI values of seven approaches (stAA, SpaGCN, conST, DeepST, CCST, STAGATE, GraphST) across 12 samples. Each method was run 10 times to reduce the effect of randomness. (B) Ground truth for sample 151674, which has seven domains: six cortex layers and one white matter (WM), followed by the spatial domain identification results for sample 151674 using stAA, SpaGCN, conST, DeepST, CCST, STAGATE and GraphST. (C) Visualization of UMAP and PAGA graphs generated by stAA, DeepST, CCST, STAGATE and GraphST, using their respective embeddings and clustering labels for sample 151674.
Figure 3
A comparative analysis of STAGATE, GraphST and stAA on Slide-seqV2-based mouse hippocampus data. (A) The spatial domains identified by the three compared approaches. Each method identifies 10 regions. A higher silhouette coefficient (SC) score and a lower Davies-Bouldin (DB) index denote better clustering performance. (B) The laminar structure of the mouse hippocampus as provided by the Allen Reference Atlas (Coronal Atlas). (C) Top, the domains clustered by stAA are shown at each spatial location. Bottom, the expression levels of the corresponding domain-specific marker genes. The examined domains are CA1 cells, CA3 cells and dentate cells. (D) Bar plots display the mean expression of marker genes specific to CA1 cells, CA3 cells and dentate cells, as inferred by the compared methods. Wilcoxon test, ***P < 0.001, ****P < 0.0001. (E) The enriched Allen Brain Atlas terms in cluster 4 versus cluster 7.
Figure 4
Clustering results of STAGATE, GraphST and stAA on the Slide-seqV2-based mouse olfactory bulb data. (A) The spatial regions detected by the three compared methods. STAGATE yields 10 clusters; GraphST and stAA each yield 11. (B) The Allen Reference Atlas (Coronal Atlas) of the mouse olfactory bulb. (C) The UMAP plot of the regions clustered by stAA. (D) The spatial trajectory inferred from the UMAP plot. The stAA clusters are ordered along the trajectory as RMS, GCL, IPL, EPL and ONL, which is consistent with the spatial topological structure. (E) Expression of known marker genes for the corresponding regions of the mouse olfactory bulb (Pcp4: GCL, Gabra1: MCL, Apod: ONL, Cck: GL, Slc17a7: EPL).
Figure 5
Results of spatial clustering using stAA on human breast cancer data and downstream analysis. (A) The manual annotation of this 10X Visium dataset, as provided in the SEDR package. The data contain 20 segmented areas, classified into four types. (B) The spatial domains identified by stAA. (C) Dot plot showing the differentially expressed genes of cluster 7 and cluster 15. (D) The ARI values of the six compared methods. conST has the lowest ARI and stAA the highest. CCST is close to GraphST, and both are better than DeepST and STAGATE. (E) The enriched GO terms in cluster 7 versus cluster 15. (F) Heatmap showing the enriched hallmark scores in each IDC cluster. (G) Heatmap showing the enriched hallmark scores in each DCIS/LCIS cluster. (H) Cell–cell interactions between subtypes; link size represents interaction strength.
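
The captions above score clusterings with the ARI, the SC score and the DB index. The following is a minimal standard-library sketch of those three metrics, taking ARI to mean the adjusted Rand index, SC the silhouette coefficient and DB the Davies-Bouldin index. It is not the authors' evaluation code; the toy coordinates, the label names and the function names are hypothetical, and the functions assume at least two clusters with distinct centroids.

```python
from collections import Counter, defaultdict
from math import comb, dist


def adjusted_rand_index(truth, pred):
    """ARI between two labelings of the same items (needs at least 2 items)."""
    total_pairs = comb(len(truth), 2)
    same_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    same_truth = sum(comb(c, 2) for c in Counter(truth).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = same_truth * same_pred / total_pairs
    maximum = (same_truth + same_pred) / 2
    # Degenerate case: both labelings are trivial, so they agree perfectly.
    if maximum == expected:
        return 1.0
    return (same_both - expected) / (maximum - expected)


def _group(points, labels):
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = _group(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # Mean distance to the other members of the point's own cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(sum(dist(point, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index; lower means tighter, better-separated clusters."""
    clusters = _group(points, labels)
    centroids, scatter = {}, {}
    for key, members in clusters.items():
        center = tuple(sum(axis) / len(members) for axis in zip(*members))
        centroids[key] = center
        scatter[key] = sum(dist(p, center) for p in members) / len(members)
    # For each cluster, take its worst (largest) similarity to any other cluster.
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Toy 2D embeddings for four spots (hypothetical values).
spots = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
truth = ["WM", "WM", "L1", "L1"]
good = [0, 0, 1, 1]  # matches the annotation up to label names
bad = [0, 1, 0, 1]   # splits both annotated domains

for name, pred in (("good", good), ("bad", bad)):
    print(name,
          adjusted_rand_index(truth, pred),
          silhouette_score(spots, pred),
          davies_bouldin_index(spots, pred))
```

The ARI requires reference labels, so it applies to annotated data such as the DLPFC and breast cancer datasets, whereas the SC score and DB index depend only on the embeddings and predicted labels, which is why they can be used where no manual annotation exists.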
