[Preprint]. 2025 Apr 15:arXiv:2504.12353v1.

TransST: Transfer Learning Embedded Spatial Factor Modeling of Spatial Transcriptomics Data

Shuo Shuo Liu et al. ArXiv.

Abstract

Background: Spatial transcriptomics has emerged as a powerful tool in biomedical research because of its ability to capture both the spatial context and the abundance of the complete RNA transcript profile in organs of interest. However, limitations of the technology, such as its relatively low resolution and comparatively insufficient sequencing depth, make it difficult to reliably extract real biological signals from these data. To alleviate this challenge, we propose a novel transfer learning framework, referred to as TransST, that adaptively leverages cell-labeled information from external sources to infer cell-level heterogeneity in a target spatial transcriptomics dataset.

Results: Applications in several real studies, as well as a number of simulation settings, show that our approach significantly improves on existing techniques. For example, in the breast cancer study, TransST successfully identifies five biologically meaningful cell clusters, including the two subgroups of cancer in situ and invasive cancer; in addition, among all the methods studied, only TransST is able to separate the adipose tissues from the connective tissues.

Conclusions: In summary, the proposed method TransST is both effective and robust in identifying cell subclusters and detecting corresponding driving biomarkers in spatial transcriptomics data.

Keywords: Clustering; Factor model; Markov random field; Spatial transcriptomics; Transfer learning.

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of TransST. TransST involves three steps. Step 1 learns the weight matrix from the source data; Step 2 adaptively updates the weight matrix of the target data by leveraging information transferred from the source data; Step 3 models the low-dimensional representation for downstream analysis, such as cluster analysis and identification of differentially expressed genes.
Fig. 2
Comparison of TransST with multiple methods across various simulation settings. Black dots represent the averaged values. A: Comparison of the clustering performance among various methods when β=0.5 (solid line) and β=1 (dotted line), using the true number of clusters; B: Transferring power of TransST with different levels of noise in the target data; C: Heatmap of differentially expressed genes for each cell type identified by TransST with β=1; D: Ridge plot visualizing the distribution of gene expression across different cell clusters; E: Comparison of the clustering performance among various methods that can estimate the number of clusters; F: Absolute error $|\hat{K} - K|$ in estimating the number of clusters by various methods.
Fig. 3
Computational time (in seconds) of various methods in different settings.
Fig. 4
TransST enables accurate spatial mapping of scRNA-seq data in human HER2-positive tumors. A: Comparison of the clustering performance among various methods, using the true number of clusters; B: Comparison of methods that can estimate the number of clusters; C: Heatmap of the top 5 differentially expressed genes for each cell type in Sample H1, identified by TransST. Each cluster is annotated based on its association with morphological regions; D: Spatial heatmaps with estimated labels by various methods for Sample H1. Morphological regions in the image are annotated by a pathologist into six distinct categories: adipose tissue (cyan), breast glands (green), cancer in situ (orange), connective tissue (blue), immune infiltrate (yellow), and invasive cancer (red).
Fig. 5
TransST enables accurate identification of brain layers of the DLPFC dataset. A: Comparison of the clustering performance among various methods, using the true number of clusters; B: Comparison of methods that can estimate the number of clusters; C: Spatial heatmaps with estimated labels by various methods for Sample 151669 (K is unknown); D: UMAP plots for TransST with colors and shapes showing the sample IDs.
Fig. 6
TransST enables accurate identification of different areas in the mouse embryo. A: Spatial heatmaps colored by TransST and the original study (truth), respectively; B: Visualization of selected spatial domains identified by the original study (Row 1) and TransST (Row 2) and visualization of selected spatial domains by marker gene expressions identified by TransST (Row 3); C: Heatmap of differentially expressed genes (top 2) identified by TransST; D: Dot plot for the distributions of selected genes.
Fig. 7
TransST enables identification of different regions and differentially expressed genes in the squamous cell carcinoma data. A: UMAP for the target sample by the spatial methods SC-MEB, DR.SC, spGMM, and TransST; B: Spatial heatmaps by various clustering methods and the scanned image; C: Heatmap of top 10 differentially expressed genes for each cell type identified by TransST.
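
The three steps in the Fig. 1 overview (learn a weight matrix from the source data, update the target weight matrix using the source, then cluster a low-dimensional representation) can be illustrated with a rough sketch. This is not the TransST algorithm: it assumes PCA-style loadings stand in for the weight matrix, uses a fixed, hypothetical transfer weight `lam` rather than an adaptive update, aligns factors by sign only, and replaces the Markov random field component with plain k-means that ignores spatial coordinates. All function names are illustrative.

```python
import numpy as np

def top_loadings(X, q):
    """Top-q principal loadings (genes x q) of a spots-by-genes matrix X."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:q].T

def align_signs(W, W_ref):
    """Flip column signs of W so each column points the same way as W_ref."""
    signs = np.sign(np.sum(W * W_ref, axis=0))
    signs[signs == 0] = 1.0
    return W * signs

def transfer_loadings(W_source, W_target, lam):
    """Shrink target loadings toward the source loadings.

    lam in [0, 1] is a hypothetical, fixed transfer weight (assumption);
    the result is re-orthonormalized with a QR decomposition.
    """
    W = (1.0 - lam) * align_signs(W_target, W_source) + lam * W_source
    Q, _ = np.linalg.qr(W)
    return Q

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd k-means on the rows of Z; returns integer labels."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def embed_and_cluster(X_source, X_target, q, k, lam=0.5):
    """Step 1: source loadings; Step 2: transfer to target; Step 3: cluster."""
    W_source = top_loadings(X_source, q)
    W_target = top_loadings(X_target, q)
    W = transfer_loadings(W_source, W_target, lam)
    Z = (X_target - X_target.mean(axis=0)) @ W  # low-dimensional representation
    return Z, kmeans(Z, k)
```

With `lam=0` the sketch reduces to clustering the target data on its own loadings, and with `lam=1` it projects the target onto the source loadings; the source and target matrices must share the same gene columns.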
