Gigascience. 2022 Aug 10;11:giac075.
doi: 10.1093/gigascience/giac075.

Stardust: improving spatial transcriptomics data analysis through space-aware modularity optimization-based clustering

Simone Avesani et al. Gigascience.

Abstract

Background: Spatial transcriptomics (ST) combines stained tissue images with spatially resolved high-throughput RNA sequencing. ST analysis includes challenging tasks such as clustering, in which data points (spots) are partitioned according to a similarity measure. Improving clustering results is a key factor because clustering affects all downstream analysis. State-of-the-art approaches group data by transcriptional similarity, and some also exploit spatial information. However, it is not yet clear how much combining spatial information with transcriptomics improves the clustering result.

Results: We propose a new clustering method, Stardust, that easily exploits the combination of spatial and transcriptomic information in the clustering procedure through manual or fully automatic tuning of the algorithm parameters. Moreover, a parameter-free version of the method is also provided, in which the spatial contribution depends dynamically on the distribution of expression distances in space. We evaluated the results of the proposed methods by analyzing ST data sets available on the 10x Genomics website and compared their clustering performance with that of state-of-the-art approaches by measuring the stability of the spots in the clusters and their biological coherence. Stability is defined as the tendency of each point to remain clustered with the same neighbors when perturbations are applied.
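The stability definition above can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption: the hypothetical `stability_scores` function scores each spot by the mean Jaccard overlap between its set of co-clustered spots in a reference clustering and in each perturbed clustering, and it assumes every run labels the same spots. It is not the authors' implementation.

```python
def co_members(labels, i):
    """Indices of the other spots that share spot i's cluster label."""
    return {j for j, lab in enumerate(labels) if lab == labels[i] and j != i}


def stability_scores(reference, perturbed_runs):
    """Per-spot mean Jaccard overlap of co-cluster sets (assumed scoring rule)."""
    scores = []
    for i in range(len(reference)):
        ref_set = co_members(reference, i)
        overlaps = []
        for run in perturbed_runs:
            run_set = co_members(run, i)
            union = ref_set | run_set
            # A spot that is alone in both clusterings counts as fully stable.
            overlaps.append(len(ref_set & run_set) / len(union) if union else 1.0)
        scores.append(sum(overlaps) / len(overlaps))
    return scores


# Toy data (hypothetical): five spots, one reference and two perturbed runs.
reference = [0, 0, 0, 1, 1]
perturbed_runs = [
    [0, 0, 0, 1, 1],  # identical partition
    [0, 0, 1, 1, 1],  # spot 2 moves to the other cluster
]
scores = stability_scores(reference, perturbed_runs)
print(scores)
```

Because the score compares sets of neighbors rather than label values, it does not require cluster labels to match across runs. In the toy data, the spot that changes cluster receives the lowest score.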

Conclusions: Stardust is an easy-to-use method that lets users define how much spatial information should influence clustering on different tissues and that achieves more stable results than state-of-the-art approaches.
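One way to picture a tunable spatial contribution is a weighted sum of an expression distance and a spatial distance between spots. The abstract does not state the formula Stardust uses, so the additive form, the Euclidean distances, the rescaling of spatial distances to the expression range, and the names `combined_distance` and `space_weight` are all assumptions made for illustration.

```python
import math


def pairwise_euclidean(points):
    """Dense matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def rescale_to_max(matrix, target_max):
    """Scale a distance matrix so that its largest entry equals target_max."""
    current_max = max(max(row) for row in matrix)
    if current_max == 0:
        return [row[:] for row in matrix]
    factor = target_max / current_max
    return [[value * factor for value in row] for row in matrix]


def combined_distance(expression, coordinates, space_weight):
    """Expression distance plus space_weight times rescaled spatial distance.

    Assumed form only: space_weight = 0 ignores space entirely, and larger
    values let physical proximity on the tissue influence the result more.
    """
    d_expr = pairwise_euclidean(expression)
    d_space = pairwise_euclidean(coordinates)
    # Put spatial distances on the same scale as expression distances.
    d_space = rescale_to_max(d_space, max(max(row) for row in d_expr))
    n = len(d_expr)
    return [
        [d_expr[i][j] + space_weight * d_space[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): three spots with 2-D expression profiles and positions.
expression = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
coordinates = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
no_space = combined_distance(expression, coordinates, 0.0)
with_space = combined_distance(expression, coordinates, 1.0)
print(no_space[0][2], with_space[0][2])
```

In the toy data, spots 0 and 2 have identical expression profiles, so their distance is zero without space. With a nonzero weight they are separated, because they lie far apart on the tissue.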

Keywords: clustering; spatial transcriptomics analysis; stability scores; parameters tuning; software comparison.

Figures

Figure 1:
Hematoxylin and eosin (H&E) stained tissue sections of human breast cancer section 1 (HBC1), human breast cancer section 2 (HBC2), mouse kidney (MK), human heart (HH), and human lymph node (HLN).
Figure 2:
Stardust performance on five ST data sets: two sections of human breast cancer (HBC1 and HBC2), mouse kidney (MK), human heart (HH), and human lymph node (HLN). (A) Stardust coefficient of variation for each configuration obtained varying the space weight and clustering resolution. Space weight and resolution tuned by maximizing the average cell stability score are shown with violet dots. (B) Stability score comparison for 5 Stardust configurations with increasing space weight and cluster resolution fixed to 0.8. (C) The count of spots shifting from stable to unstable and vice versa at stability thresholds equal to 0.25, 0.5, and 0.75, which set the limit to consider a spot stable (above the threshold) or unstable (below the threshold), comparing the best configuration of Stardust (i.e., with the lowest coefficient of variation) with the one not using space information.
Figure 3:
Performance of the Stardust* space version compared with the no space version, evaluated on 5 ST data sets: 2 serial stages of human breast cancer (HBC1 and HBC2), mouse kidney (MK), human heart (HH), and human lymph node (HLN). (A) Comparison of the coefficient of variation values for 3 Stardust* space and no space configurations obtained by varying the clustering resolution. (B) Comparison of the stability scores for 3 Stardust* space and no space configurations obtained by varying the clustering resolution. (C) The Stardust* count of spots shifting from stable to unstable and vice versa, with clustering without space information as the baseline, at clustering resolutions of 0.6, 0.8, and 1, which set the limit to consider a spot stable (above the threshold) or unstable (below the threshold).
Figure 4:
Cluster biological coherence. (A) Pathologists' manual annotation of the human breast cancer 1 data set provided by Lewis et al. [2] and the clustering achieved with the best configuration of Stardust* with resolution 0.6. (B) Spatial plots showing the expression level of three of the top 100 genes with the highest Moran index for the HBC1 Visium data set.
Figure 5:
Comparison of Stardust, Stardust*, and state-of-the-art tools on the HBC1 data set. (A) The coefficient of variation values derived from the stability score distribution of each tool configuration. The cluster resolution refers to the resolution parameter for the Louvain community detection algorithm; image usage tells whether the image is included in the clustering method. (B) The cell stability score distributions of the best-performing configuration of each tool (i.e., the one with the lowest coefficient of variation). (C) The hematoxylin and eosin (H&E) stained tissue sample and a spatial plot for each best tool configuration with clusters of spots on the tissue section. (D) The stability score shifts obtained comparing the best configuration of each tool with the base Stardust no space version (i.e., the one not considering space).
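The captions of Figures 2, 3, and 5 rely on two summary quantities: the coefficient of variation of a stability score distribution (lower is treated as better) and the count of spots that cross a stability threshold between a baseline and a candidate configuration. The sketch below shows how such quantities could be computed. The function names, the use of the population standard deviation, and the treatment of a score exactly equal to the threshold as unstable are assumptions, not details taken from the paper.

```python
import statistics


def coefficient_of_variation(scores):
    """Standard deviation divided by the mean of the stability scores."""
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Population standard deviation is an assumption; the sample version
    # (statistics.stdev) would be an equally plausible choice.
    return statistics.pstdev(scores) / mean


def count_shifts(baseline, candidate, threshold):
    """Count spots moving unstable -> stable and stable -> unstable.

    A spot is treated as stable when its score is strictly above the threshold.
    """
    gained = sum(1 for b, c in zip(baseline, candidate) if b <= threshold < c)
    lost = sum(1 for b, c in zip(baseline, candidate) if c <= threshold < b)
    return gained, lost


# Toy stability scores (hypothetical) for four spots under two configurations.
baseline = [0.2, 0.6, 0.9, 0.4]
candidate = [0.7, 0.3, 0.95, 0.3]
for threshold in (0.25, 0.5, 0.75):
    print(threshold, count_shifts(baseline, candidate, threshold))
print(coefficient_of_variation(baseline))
```

Reporting gains and losses separately, rather than a net difference, keeps visible the case in which a configuration stabilizes some spots while destabilizing others.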

References

    1. Buettner F, Natarajan KN, Casale FP, et al. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol. 2015;33(2):155–60.
    2. Lewis SM, Asselin-Labat ML, Nguyen Q, et al. Spatial omics and multiplexed imaging to explore cancer biology. Nat Methods. 2021;18(9):1–16.
    3. Ståhl PL, Salmén F, Vickovic S, et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 2016;353(6294):78–82.
    4. Asp M, Bergenstråhle J, Lundeberg J. Spatially resolved transcriptomes—next generation tools for tissue exploration. Bioessays. 2020;42(10):1900221.
    5. Marx V. Method of the year: spatially resolved transcriptomics. Nat Methods. 2021;18(1):9–14.