Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i131-i139.
doi: 10.1093/bioinformatics/btad242.

SpatialSort: a Bayesian model for clustering and cell population annotation of spatial proteomics data

Eric Lee et al. Bioinformatics.

Abstract

Motivation: Recent advances in spatial proteomics technologies have enabled the profiling of dozens of proteins in thousands of single cells in situ. This has created the opportunity to move beyond quantifying the composition of cell types in tissue, and instead probe the spatial relationships between cells. However, most current methods for clustering data from these assays only consider the expression values of cells and ignore the spatial context. Furthermore, existing approaches do not account for prior information about the expected cell populations in a sample.

Results: To address these shortcomings, we developed SpatialSort, a spatially aware Bayesian clustering approach that allows for the incorporation of prior biological knowledge. Our method is able to account for the affinities of cells of different types to neighbour in space, and by incorporating prior information about expected cell populations, it is able to simultaneously improve clustering accuracy and perform automated annotation of clusters. Using synthetic and real data, we show that by using spatial and prior information, SpatialSort improves clustering accuracy. We also demonstrate how SpatialSort can perform label transfer between spatial and nonspatial modalities through the analysis of a real-world diffuse large B-cell lymphoma dataset.
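
As a rough illustration of how spatial context can enter a clustering model of this kind, the sketch below computes the conditional label distribution of a single cell under a Potts-style Markov random field combined with a Gaussian expression likelihood. It is not SpatialSort's implementation. The function name `label_posterior`, the one-dimensional expression value, the shared standard deviation `sigma`, and the single interaction parameter `beta` are all assumptions made for the example.

```python
import math

def label_posterior(x, neighbour_labels, means, sigma, beta):
    """Conditional distribution over one cell's cluster label.

    Hypothetical sketch: x is a 1-D expression value, neighbour_labels holds
    the current labels of the cell's spatial neighbours, means[k] is the mean
    expression of cluster k, and beta is a single interaction strength.
    """
    log_weights = []
    for label, mean in enumerate(means):
        # Gaussian log-likelihood up to a constant; the normalising term is
        # the same for every label because sigma is shared, so it cancels.
        log_lik = -0.5 * ((x - mean) / sigma) ** 2
        # Potts term: each neighbour carrying the same label adds beta.
        agreement = sum(1 for n in neighbour_labels if n == label)
        log_weights.append(log_lik + beta * agreement)

    # Normalise with the log-sum-exp trick for numerical stability.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]
```

With `beta = 0` the neighbours are ignored and the result depends on expression alone; a positive `beta` pulls an ambiguous cell towards the label that is most common among its neighbours.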

Availability and implementation: Source code is available on GitHub at https://github.com/Roth-Lab/SpatialSort.

Conflict of interest statement

C.H. and A.G. are employees of Bristol Myers Squibb. The other authors declare that they have no competing interests.

Figures

Figure 1.
(a) Schematic overview, (b) probabilistic graphical model, and (c) prior distributions of SpatialSort. SpatialSort requires expression, cell location, and neighbour relation data as inputs. For each patient, a neighbour graph modelled by an MRF is built to represent the spatial context. Using both expression and spatial data for inference, SpatialSort jointly infers cluster assignment and the interaction parameter of the HMRF to probabilistically assign each cell to a given cell type cluster. When an expectation of certain cell types or a collection of labelled data is present, a prior expression matrix or an anchor expression matrix can be incorporated to improve clustering or perform label transfer.
Figure 2.
Comparison of model-fitting performance on forward-simulated datasets: (a) biased and (b) uniform. The methods applied are shown on the x-axis: 0p indicates the Potts model; 1p and Kp are different parameterisations of the Potts model; GMM indicates the Gaussian mixture model. The y-axis shows performance scored by homogeneity, completeness, and V-measure, colour-coded according to the legend.
Figure 3.
Comparison of model-fitting performance on spatial Gaussian mixture datasets: (a) biased and (b) uniform. The average overlap of each spatial Gaussian mixture dataset is shown on the x-axis, and the V-measure score on the y-axis. Each dataset, at a given average overlap, was fit by three methods: SpatialSort, GMM, and Phenograph.
Figure 4.
Performance on semireal spatial CyTOF data. (a) A spatial neighbour graph of a single sample in the biased dataset. Each node represents a single cell, colour-coded by cluster assignment. Cells tend to engage in autonomous interactions spatially. (b) Boxplot of V-measure scores showing the clustering accuracy of various methods fit to 100 semireal biased datasets. (c) and (d) show examples from the uniform dataset for comparison. With uniform interaction terms, a cell has a random chance of neighbouring any type of cell.
Figure 5.
Cell type annotation of DLBCL MIBI data using SpatialSort. (a) Bar graph of the cell type distribution in the clustering results from the 0p, 1p, and Kp models. Counts are log-scaled. (b) Spatial distribution of the expression of the lymphocyte lineage markers PAX5 and CD3 in sample P7683. Colour represents normalized intensity of expression. (c) Sample P7683 plotted by spatial coordinates. Cells are colour-coded by the cell type assignment inferred by the 0p, 1p, and Kp models in anchor mode. The red box in (b) and (c) highlights one area of significant difference in cell assignment. (d) Sample-specific expression heatmaps for sample P7683. Rows are colour-coded by the cell types shown in (c). T-cells are highlighted by red boxes to illustrate mixed PAX5 expression.
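
The figure captions above score clustering accuracy with homogeneity, completeness, and V-measure. For readers unfamiliar with these metrics, the following is a minimal sketch of how they can be computed from two label lists using only the Python standard library. It follows the standard entropy-based definitions and is not taken from the SpatialSort code base; the function names are illustrative assumptions.

```python
import math
from collections import Counter

def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g])
                for (_, g), c in joint.items())

def v_measure(truth, predicted):
    """Return (homogeneity, completeness, V-measure) for two labelings."""
    h_truth = _entropy(truth)
    h_pred = _entropy(predicted)
    # Homogeneity: each cluster contains members of a single true class.
    homogeneity = (1.0 if h_truth == 0
                   else 1.0 - _conditional_entropy(truth, predicted) / h_truth)
    # Completeness: all members of a true class fall in the same cluster.
    completeness = (1.0 if h_pred == 0
                    else 1.0 - _conditional_entropy(predicted, truth) / h_pred)
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    # V-measure is the harmonic mean of the two scores.
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v
```

All three scores are invariant to how cluster labels are named, so a clustering that recovers the true groups under different label names still scores 1.0. Splitting every true class into singleton clusters keeps homogeneity at 1.0 but lowers completeness.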
