Genome Biol. 2024 Apr 18;25(1):99.
doi: 10.1186/s13059-024-03241-7.

Library size confounds biology in spatial transcriptomics data

Dharmesh D Bhuva et al. Genome Biol. 2024.

Abstract

Spatial molecular data have transformed the study of disease microenvironments, but larger datasets pose an analytical challenge, prompting the direct adoption of single-cell RNA-sequencing (scRNA-seq) tools, including normalization methods. Here, we demonstrate that library size is associated with tissue structure and that normalizing these effects out using commonly applied scRNA-seq normalization methods negatively affects spatial domain identification. Spatial data should not be specifically corrected for library size prior to analysis, and algorithms designed for scRNA-seq data should be adopted with caution.
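To make the abstract's central point concrete, the following is a minimal sketch of what correcting for library size does to a spot-by-gene count matrix. The counts, region labels, and function names are hypothetical and chosen only for illustration. The scaling shown is plain library-size scaling, not sctransform or any other method evaluated in the paper.

```python
from statistics import mean

# Toy count matrix: rows are spots, columns are genes (hypothetical values).
counts = [
    [30, 10, 20],  # spot in region A
    [28, 12, 20],  # spot in region A
    [6, 2, 4],     # spot in region B
    [5, 3, 4],     # spot in region B
]
regions = ["A", "A", "B", "B"]  # hypothetical region annotation per spot


def library_sizes(matrix):
    """Library size of a spot is the sum of its counts over all genes."""
    return [sum(row) for row in matrix]


def scale_to_target(matrix, target=None):
    """Rescale every spot so its library size equals `target`.

    Defaults to the mean library size. Assumes no spot has zero counts.
    """
    sizes = library_sizes(matrix)
    if target is None:
        target = mean(sizes)
    return [
        [value * target / size for value in row]
        for row, size in zip(matrix, sizes)
    ]


def mean_by_region(values, labels):
    """Average a per-spot value within each region label."""
    return {
        label: mean(v for v, l in zip(values, labels) if l == label)
        for label in sorted(set(labels))
    }


raw_sizes = library_sizes(counts)
scaled_sizes = library_sizes(scale_to_target(counts))

print("raw library size by region:   ", mean_by_region(raw_sizes, regions))
print("scaled library size by region:", mean_by_region(scaled_sizes, regions))
```

In this toy case the raw library sizes separate the two regions, while the scaled library sizes are identical across all spots, so any regional signal carried by library size is no longer available to a downstream clustering step.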

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Detection density and total detections/library sizes are associated with biology consistently across different spatial molecular technologies, organs, and species. a–d Detection density per bin/spot plots for Visium dorsolateral prefrontal cortex (DLPFC), Xenium mouse brain, STOmics mouse brain, and CosMx non-small cell lung cancer (NSCLC) reveal tissue structure. e–h Regions annotated for each bin/spot using the Allen Brain Atlas for the mouse brain and manual annotation based on immunofluorescence markers for CosMx NSCLC. i–l Number of cells plotted against the total detections/library sizes per bin/spot, colored by tissue region, showing the region-specific relationship between cells and total detections/library sizes. m–p Average total detections/library sizes per cell for each region, computed as the sum of detections divided by the number of cells for each region, showing that related regions exhibit similar total detections/library sizes per cell. As the mouse brain datasets have over 100 regions annotated, color schemes from the Allen Brain Atlas are used, where only larger structures are colored. (Note: truncated outlier marked by x)
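The per-region quantity described for panels m–p (sum of detections divided by number of cells, per region) can be sketched as below. This is an illustrative reading of the caption, not the authors' code; the function name, the input layout of (region, detections, cells) tuples, and the example values are assumptions.

```python
from collections import defaultdict


def detections_per_cell_by_region(bins):
    """Return total detections divided by total cells for each region.

    `bins` is an iterable of (region, detections, cells) tuples, one per
    bin/spot. Regions with no cells are skipped to avoid dividing by zero.
    """
    totals = defaultdict(lambda: [0, 0])  # region -> [detections, cells]
    for region, detections, cells in bins:
        totals[region][0] += detections
        totals[region][1] += cells
    return {
        region: detections / cells
        for region, (detections, cells) in totals.items()
        if cells > 0
    }


# Hypothetical bins: region names and numbers are made up for illustration.
example_bins = [
    ("cortex", 1200, 4),
    ("cortex", 800, 4),
    ("white_matter", 300, 3),
]
per_region = detections_per_cell_by_region(example_bins)
print(per_region)
```

Summing within a region before dividing weights each bin/spot by the number of cells it contains, which differs from averaging per-bin ratios when cell numbers vary across bins.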
Fig. 2
Normalization of total detections/library sizes results in poorer spatial domain identification using clustering approaches. a Schematic of the benchmark performed on 25 samples spanning four spatial transcriptomics technologies, showing the parameter space explored when using a single-cell clustering pipeline as well as two spatially aware methods to identify spatial domains. b The adjusted Rand index (ARI) obtained when different normalization strategies are applied to the different datasets using three different clustering methods: graph-based clustering, SpaGCN, and BayesSpace. Explicit library size normalization using sctransform results in poorer domain identification across most datasets, indicating that library size confounds biology in spatial transcriptomics datasets. The choice of normalization method depends on the clustering algorithm and dataset type.
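For readers unfamiliar with the metric in panel b, the adjusted Rand index compares a clustering against reference labels while correcting for chance agreement. The sketch below uses the standard pair-counting formula in plain Python. It is not the benchmark code, and the example labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Pairs of items grouped together in both labelings, in the reference
    # labeling only, and in the predicted labeling only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    true_only = sum(comb(c, 2) for c in Counter(labels_true).values())
    pred_only = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = true_only * pred_only / total_pairs  # agreement expected by chance
    maximum = (true_only + pred_only) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (both - expected) / (maximum - expected)


# Hypothetical reference domains and a clustering result for six spots.
reference = ["L1", "L1", "L1", "WM", "WM", "WM"]
clustering = [0, 0, 1, 1, 2, 2]
print(adjusted_rand_index(reference, clustering))
```

A value of 1 means the clustering reproduces the reference domains exactly, up to renaming of clusters, and values near 0 indicate agreement no better than chance.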

