Genome Biol. 2024 Apr 2;25(1):82.
doi: 10.1186/s13059-024-03217-7.

Bento: a toolkit for subcellular analysis of spatial transcriptomics data

Clarence K Mah et al. Genome Biol.

Abstract

The spatial organization of molecules in a cell is essential for their functions. While current methods focus on discerning tissue architecture, cell-cell interactions, and spatial expression patterns, they are limited to the multicellular scale. We present Bento, a Python toolkit that takes advantage of single-molecule information to enable spatial analysis at the subcellular scale. Bento ingests molecular coordinates and segmentation boundaries to perform three analyses: defining subcellular domains, annotating localization patterns, and quantifying gene-gene colocalization. We demonstrate Bento on MERFISH, seqFISH+, Molecular Cartography, and Xenium datasets. Bento is part of the open-source Scverse ecosystem, enabling integration with other single-cell analysis tools.

Conflict of interest statement

G.W.Y. is a co-founder, member of the board of directors, equity holder, and paid consultant for Locanabio (until 12/31/2023) and Eclipse Bioinnovations, and a Scientific Adviser and paid consultant to Jumpcode Genomics. G.W.Y. is a Distinguished Visiting Professor at the National University of Singapore. The terms of these arrangements have been reviewed and approved by the University of California, San Diego, in accordance with its conflict-of-interest policies. The authors declare no other competing interests.

Figures

Fig. 1
Workflow and functionality of the Bento toolkit. A Single-molecule resolved spatial transcriptomics data from commercial or custom platforms are ingested into Bento, where they are converted to the AnnData format (B) and can be manipulated with Bento as well as a wide ecosystem of single-cell omics tools. C Geometric statistics are illustrated for the seqFISH+ dataset, including metrics describing cell and nuclear geometries and cell density to assess overall data quality. D Bento has a standard interface to perform a wide variety of subcellular analyses.
Fig. 2
Subcellular localization pattern identification with RNAforest. A Thirteen spatial summary statistics are computed for every gene-cell pair, describing the spatial arrangement of molecules and boundaries in relation to one another. The features (Supp. Table 1) are inputs for RNAforest, a multilabel ensemble classifier that assigns one or more subcellular localization labels: cell edge, cytoplasmic, nuclear, nuclear edge, and none. The colors for each label are used consistently throughout the figure. The top 10 genes for each label other than “none” are visualized in B U2-OS cells and C 3T3 cells. D and E are UpSet plots showing the proportion of measured transcripts assigned to each label. F and G show the relative label proportion across cells for each gene, colored by the majority label. H Top 5 consistent genes for each label. I ssGSEA identifies the enrichment of GO cellular component domains for each label in the 3T3 cell dataset. Stars denote p-values under thresholds defined in the legend. P-values are derived from ssGSEA permutation tests with Benjamini–Hochberg correction controlling for false discovery rate.
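The Benjamini–Hochberg correction named in this and later captions can be sketched in a few lines. This is a minimal illustration of the standard step-up procedure using only the Python standard library; the function name `benjamini_hochberg` and the toy p-values are hypothetical and are not taken from Bento or from the paper's data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration only; not part of Bento.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone in the rank (the "step-up" part of the procedure).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values, made up for this example.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A label or gene would then be starred when its adjusted p-value falls below the threshold given in the figure legend.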
Fig. 3
Compartment-specific RNA colocalization with RNAcoloc. A Transcripts are separated by compartment (nucleus and cytoplasm) before CLQ scores are calculated for every gene pair across all cells. This yields a cell × gene pair × compartment tensor. B Pairwise comparison of log CLQ distributions for gene pairs and self-pairs, further categorized by compartment. The Mann–Whitney U test was used for comparisons. Stars denote p-values below the legend threshold with Benjamini–Hochberg correction controlling for false discovery rate. From top to bottom, group sizes are 12,254,430 (cytoplasm gene pairs), 115,187 (nucleus gene pairs), 6,778,402 (cytoplasm self-pairs), and 86,474 (nucleus self-pairs). C Tensor decomposition yields 4 factors. From left to right, the three heatmaps show the loadings of each factor for each dimension: compartments, cells, and gene pairs. Only the top 5 associated gene pairs for each factor are shown. D Top examples of compartment-specific colocalized gene pairs. Black scale bars denote 10 μm.
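To make the CLQ score concrete, here is a minimal sketch of one common nearest-neighbor formulation of the colocation quotient: the fraction of gene A molecules whose nearest neighbor is gene B, divided by the fraction expected from gene B's overall abundance. This is an assumption about the definition; Bento's RNAcoloc may define neighborhoods differently (for example, by radius). The function name `colocation_quotient` and the toy coordinates are hypothetical.

```python
import math


def colocation_quotient(points, labels, gene_a, gene_b):
    """Nearest-neighbor colocation quotient of gene_a toward gene_b.

    points: list of (x, y) molecule coordinates within one compartment.
    labels: gene name for each point, in the same order.
    Hypothetical helper for illustration only; not Bento's implementation.
    """
    n = len(points)
    n_a = sum(1 for g in labels if g == gene_a)
    n_b = sum(1 for g in labels if g == gene_b)
    if gene_a == gene_b:
        n_b -= 1  # a molecule cannot be its own neighbor (self-pairs)
    if n < 2 or n_a == 0 or n_b <= 0:
        return float("nan")

    hits = 0
    for i, (p, g) in enumerate(zip(points, labels)):
        if g != gene_a:
            continue
        # Brute-force nearest neighbor; fine for a small sketch.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        if labels[nearest] == gene_b:
            hits += 1

    observed = hits / n_a        # share of A whose nearest neighbor is B
    expected = n_b / (n - 1)     # share expected if labels were random
    return observed / expected


# Toy coordinates, made up for this example.
toy_points = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 10), (10.1, 10)]
toy_labels = ["A", "B", "A", "B", "C", "C"]
toy_clq = colocation_quotient(toy_points, toy_labels, "A", "B")
```

Values above 1 indicate that gene B appears next to gene A more often than its abundance alone would predict; running this per cell and per compartment for every gene pair would fill a tensor shaped like the one described in panel A.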
Fig. 4
RNAflux finds distinct subcellular domains with consistent spatial organization and local gene composition. A Flowchart of RNAflux and fluxmap computation. Local neighborhoods of a fixed radius are arrayed across a cell and a normalized gene composition is computed for each pixel coordinate, producing an RNAflux embedding. The first three principal components of the RNAflux embedding are visualized for U2-OS cells, with the RGB channels of each pixel set by its PC1, PC2, and PC3 values, respectively. Fluxmap domains are computed from each RNAflux embedding to create semantic segmentation masks of each subcellular domain. B The left panel shows a field of view of U2-OS cells, with dots denoting individual molecules colored by gene species, and nuclei and cell boundaries outlined in white. For the same field of view, the center panel shows RNAflux embeddings and the right panel shows fluxmap domains. C The scatter plot shows how the composition of each gene is distributed across fluxmap domains. The position of each point denotes the relative bias of a given gene’s composition across fluxmaps. D Heatmap showing the fraction of pixels with a positive enrichment value for each APEX-seq location for each fluxmap domain. E The most highly enriched location is shown for each fluxmap domain. Domain boundaries are denoted by white lines within each cell.
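The first step of panel A, computing a local gene composition around each pixel, can be sketched as follows. This is a simplified illustration under stated assumptions: the composition is normalized to fractions that sum to one, whereas the normalization used by RNAflux itself may differ, and the PCA and domain-clustering steps are omitted. The function name `local_gene_composition` and the toy molecules are hypothetical.

```python
import math


def local_gene_composition(molecules, pixels, genes, radius):
    """Per-pixel gene fractions within a fixed-radius neighborhood.

    molecules: list of (x, y, gene) tuples for one cell.
    pixels:    list of (x, y) pixel coordinates to evaluate.
    genes:     ordered list of gene names (defines the embedding columns).
    Hypothetical helper for illustration only; not Bento's implementation.
    """
    column = {gene: k for k, gene in enumerate(genes)}
    embedding = []
    for px, py in pixels:
        counts = [0] * len(genes)
        for x, y, gene in molecules:
            # Count molecules that fall inside the neighborhood.
            if math.hypot(x - px, y - py) <= radius:
                counts[column[gene]] += 1
        total = sum(counts)
        # Assumed normalization: fractions summing to 1 (zeros if empty).
        embedding.append([c / total if total else 0.0 for c in counts])
    return embedding


# Toy molecules, made up for this example.
toy_molecules = [(0, 0, "A"), (0.5, 0, "A"), (0, 0.5, "B"), (10, 10, "B")]
toy_embedding = local_gene_composition(
    toy_molecules, [(0, 0), (10, 10)], ["A", "B"], radius=1.0
)
```

Each row is one pixel's composition vector; reducing these rows to three principal components would give the RGB visualization described in the caption.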
Fig. 5
Subcellular RNA localization changes upon Doxorubicin treatment in iPSC-derived cardiomyocytes. A Cardiomyocytes derived from human iPSCs were treated with DMSO or 2.5 μM DOX for 12 h. The localizations of 100 genes relevant to cardiomyocyte health and function were measured using Molecular Cartography. Cell boundaries were determined using ClusterMap and nuclei were segmented using Cellpose. B Top 10 differentially upregulated and downregulated genes in vehicle versus treatment. A t-test was used for comparisons. All genes shown are significant given an adjusted p-value threshold of p < 0.01. Benjamini–Hochberg correction was used to control for the false discovery rate. Vehicle and treatment conditions have n = 7159 and 6260 cells, respectively. C APEX-seq location-specific gene enrichment of fluxmap domains for the cytosol, endoplasmic reticulum membrane (ERM), endoplasmic reticulum lumen (ER Lumen), nuclear lamina, nucleus, nucleolus, nuclear pore, and outer mitochondrial membrane (OMM). D Fluxmap domains visualized for a representative field of view of cardiomyocytes for vehicle and treatment, respectively, highlighting cellular nuclei, ERM/OMM, ER Lumen, and cytosol. E RNAflux fluxmap enrichment of each gene averaged across vehicle and treatment cardiomyocytes captures changes in subcellular RNA localization. The top 10 genes are labeled and ranked by the largest shifts between compartment compositions. Shifts are quantified by Wasserstein distance. F Average gene enrichment in each fluxmap across vehicle and treatment conditions, colored by log-fold expression, demonstrates population-level shifts in transcript subcellular localization. G Visualization of RBM20, CACNB2, and LAMP2 transcripts confirms the depletion of transcripts from the perinuclear and cytosolic compartments of cardiomyocytes upon DOX treatment.
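As an illustration of how a shift can be scored with the Wasserstein distance, the sketch below uses the fact that, for two one-dimensional samples of equal size, the Wasserstein-1 distance equals the mean absolute difference between their sorted values. How the paper arranges each gene's enrichment values before computing the distance is not stated here, so the inputs are an assumption; the function name `wasserstein_1d` and the toy values are hypothetical.

```python
def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two equal-sized 1D samples.

    For equal-sized samples the optimal transport plan pairs the
    sorted values, so the distance is their mean absolute difference.
    Hypothetical helper for illustration only; not Bento's implementation.
    """
    if not u or len(u) != len(v):
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(u), sorted(v))
    return sum(abs(a - b) for a, b in paired) / len(u)


# Toy enrichment values for one gene, made up for this example.
toy_vehicle = [0.1, 0.5, 0.4]
toy_treated = [0.3, 0.3, 0.4]
toy_shift = wasserstein_1d(toy_vehicle, toy_treated)
```

Genes could then be ranked by this value, with larger distances indicating larger localization shifts between conditions. For samples of unequal size or weighted values, `scipy.stats.wasserstein_distance` handles the general case.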
