Review
Genome Res. 2021 Oct;31(10):1706-1718.
doi: 10.1101/gr.275224.121.

Advances in spatial transcriptomic data analysis

Ruben Dries et al. Genome Res. 2021 Oct.

Abstract

Spatial transcriptomics is a rapidly growing field that promises to comprehensively characterize tissue organization and architecture at single-cell or subcellular resolution. Such information provides a solid foundation for a mechanistic understanding of many biological processes in both health and disease, one that cannot be obtained with traditional technologies. The development of computational methods plays an important role in extracting biological signals from raw data. Various approaches have been developed to overcome technology-specific limitations such as spatial resolution, gene coverage, sensitivity, and technical biases. Downstream analysis tools formulate spatial organization and cell-cell communication as quantifiable properties and provide algorithms to derive them. Integrative pipelines further assemble multiple tools in one package, allowing biologists to conveniently analyze data from beginning to end. In this review, we summarize the state of the art of spatial transcriptomic data analysis methods and pipelines, and discuss how they operate on different technological platforms.

Figures

Figure 1.
Data sets used in this Perspective.
Figure 2.
Preprocessing of raw spatial transcriptomic data. (A) For spatial transcriptomics data paired with images, processing begins with correction and stitching of multiple captures or fields of view (FOVs) to form a clear composite image. (B) Images from multiple stacked sections of the same tissue can be registered and the resulting spatial transformations mapped back to the transcriptomic data in order to create an aligned 3D gene expression data set. This is illustrated with the breast cancer spatial transcriptomics data set from Andersson et al. (2020b). (C) Several methods exist to provide expression data with spatial context. For technologies such as FISH and ISS that do not have clearly defined read spots or boundaries, cell segmentation (upper panel) is required in order to assign reads to individual cells. In situ capture or array-based methods, on the other hand (lower panel), assign reads to read spots based on a spatial barcode unique to each spatial unit (e.g., spot).
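As a rough illustration of the barcode-based assignment described in panel C, the sketch below builds a spot-by-gene count table from reads tagged with a spatial barcode. The input format, the function name `count_reads_per_spot`, and the toy barcodes are assumptions made for this example; they are not taken from the review or from any specific pipeline.

```python
from collections import Counter, defaultdict


def count_reads_per_spot(reads, barcode_to_xy):
    """Build a spot-by-gene count table from (barcode, gene) read pairs.

    reads: iterable of (spatial_barcode, gene) tuples (hypothetical format).
    barcode_to_xy: dict mapping each known barcode to its (x, y) spot position.
    Reads whose barcode is not in the lookup table are dropped.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # unknown barcode: the read cannot be placed on the array
        counts[xy][gene] += 1
    return {xy: dict(gene_counts) for xy, gene_counts in counts.items()}


# Toy data: two known spots and one read with an unrecognized barcode.
barcode_to_xy = {"AAAC": (0, 0), "AAAG": (0, 1)}
reads = [("AAAC", "Actb"), ("AAAC", "Actb"), ("AAAG", "Gfap"), ("TTTT", "Actb")]
spot_counts = count_reads_per_spot(reads, barcode_to_xy)
```

Real pipelines typically also tolerate sequencing errors in the barcode and collapse duplicate molecules; both steps are omitted here for brevity.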
Figure 3.
Overview of spatial transcriptomics analysis methods. A variety of analyses can be performed on spatial transcriptomics data. (A) Analysis can be performed on the image itself, ranging from early tasks such as cell segmentation to support of subcellular analysis through cell shape and size classification. (B) Cell types can be identified through clustering and annotation. Additional integration with external scRNA-seq data or deconvolution of spatial units that cover multiple cells (C) can be performed to fine-tune cell type mapping. (D) The spatial distribution of cell types and the underlying cell-to-cell communication (E) can be computed. (F) Spatial expression patterns are identified and visualized based on gene expression and spatial coordinates. (G) Data at subcellular resolution can be used to identify spatial and temporal dynamics of transcripts within a single cell.
Figure 4.
Strategies for cell type identification with spatial transcriptomic data. (A) Spatial transcriptomics data at single-cell resolution can be directly used to identify cell types in an analogous manner to scRNA-seq. In addition, external scRNA-seq data from matching tissue can be integrated to increase the number of available features and aid in the identification of detected cell types. (B) An example of cell type annotation is shown on the MERFISH mouse coronal brain slice data set. Each dot represents a single cell, and colors indicate different cell types identified through clustering. A zoomed-in subset shows the spatial cell type composition at a higher resolution. (C) Cell types in non-single-cell spatial transcriptomic data are identified through deconvolution approaches that make use of external information or through gene enrichment strategies using sets of known marker genes or scRNA-seq information. (D) Enrichment scores for two cell types within the human heart 10x Genomics Visium data set are overlaid on top of the spots within a region of interest. (E) Pie charts depict the proportion of identified cell types within each selected spot used in D.
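To make the marker-gene enrichment idea in panel C concrete, here is a deliberately simple score: the mean log-transformed count of a marker set within one spot, minus the mean over all genes measured in that spot. This is an illustrative stand-in, not the scoring used by any published tool, and the marker sets and counts below are toy values chosen for the example.

```python
import math


def marker_score(spot_counts, markers):
    """Mean log1p count of the marker genes minus the spot-wide mean log1p count.

    spot_counts: dict mapping gene -> raw count for one spot.
    markers: list of marker genes for one cell type (genes absent from the
    spot contribute a count of zero).
    """
    logs = {gene: math.log1p(count) for gene, count in spot_counts.items()}
    if not logs or not markers:
        return 0.0
    background = sum(logs.values()) / len(logs)
    marker_mean = sum(logs.get(gene, 0.0) for gene in markers) / len(markers)
    return marker_mean - background


# Toy spot dominated by two genes; the marker lists are assumptions.
spot = {"Myh6": 30, "Actc1": 20, "Col1a1": 1, "Dcn": 0}
cardiomyocyte_score = marker_score(spot, ["Myh6", "Actc1"])
fibroblast_score = marker_score(spot, ["Col1a1", "Dcn"])
```

A positive score suggests the marker set is over-represented in the spot relative to its own background; a negative score suggests the opposite. Deconvolution methods go further and estimate cell type proportions per spot, which this sketch does not attempt.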
Figure 5.
Spatial pattern analyses. (A) Spatial distribution analysis of neighboring cell types. Network represents the likelihood of two cell types being found in close physical proximity to each other. (B) A subset of cells from the MERFISH mouse coronal brain slice data set shows the spatial network connectivity and cellular proximities between different cell types. (C) At the single-cell level, cellular niches can be identified based on a target cell (yellow) and its direct neighboring cells (blue). The composition and position of the neighboring cell types create a niche for the target cell (bottom). (D) Source and neighboring cells are depicted within a small subset of the MERFISH mouse coronal brain slice data set. (E) Spatial gene expression patterns can be derived from single or multiple genes and can be continuous (top) or discrete (bottom). (F) Individual genes with unique, spatially coherent expression patterns in the MERFISH mouse brain coronal data set are shown on the right.
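The neighborhood analysis in panels A and B can be approximated by connecting cells that lie within a fixed distance of each other and counting how often each pair of cell types is connected. The sketch below does only that counting step; the radius-based neighbor definition, the function name, and the coordinates are assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def celltype_pair_counts(cells, radius):
    """Count unordered cell-type pairs among cells closer than `radius`.

    cells: list of (x, y, cell_type) tuples.
    Returns a Counter keyed by alphabetically sorted (type_a, type_b) tuples.
    """
    pairs = Counter()
    for (x1, y1, type1), (x2, y2, type2) in combinations(cells, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            pairs[tuple(sorted((type1, type2)))] += 1
    return pairs


# Toy layout: three nearby cells and one distant cell.
cells = [
    (0, 0, "neuron"),
    (1, 0, "astrocyte"),
    (0, 1, "neuron"),
    (10, 10, "microglia"),
]
pair_counts = celltype_pair_counts(cells, radius=1.5)
```

Raw counts favor abundant cell types, so a proximity likelihood would normally be obtained by comparing these counts with those from randomly shuffled cell type labels. The all-pairs loop is quadratic and is only suitable for small examples.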
Figure 6.
Schematic diagram for spatial transcriptomics analysis at subcellular resolution. (A) For spatial data at subcellular resolution, each dot typically represents a single transcript or, alternatively, a spatial unit that is well below the cell size. (B) The location of each transcript, along with its gene identity, can be used as input to segment each cell. (C) Individual transcripts can be colocalized with transcripts of other genes (orange and blue) or of the same gene (green), or can be found at specific subcellular structures (pink at membrane). (D) Transcription dynamics from individual or multiple genes can be inferred from the location of transcripts. Here nascent transcripts are typically found in the nucleus (blue), whereas processed transcripts are found in the cytoplasm (orange). The ratio between the two can provide an estimate for the RNA velocity. Examples for each analysis are provided on the right of each panel using the seqFISH+ data set from the mouse somatosensory cortex.
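Panel D relies on comparing nuclear and cytoplasmic transcript counts per gene. A minimal sketch of that bookkeeping follows, assuming each detected transcript has already been labeled with a compartment; the labels "nucleus" and "cytoplasm" and the input format are assumptions, and the resulting fraction is only a crude proxy, not an RNA velocity estimate.

```python
from collections import defaultdict


def nuclear_fraction(transcripts):
    """Return, per gene, the fraction of its transcripts located in the nucleus.

    transcripts: iterable of (gene, compartment) tuples, where compartment is
    "nucleus" or "cytoplasm"; any other label is ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # gene -> [nuclear, cytoplasmic]
    for gene, compartment in transcripts:
        if compartment == "nucleus":
            counts[gene][0] += 1
        elif compartment == "cytoplasm":
            counts[gene][1] += 1
    return {
        gene: nuclear / (nuclear + cytoplasmic)
        for gene, (nuclear, cytoplasmic) in counts.items()
    }


# Toy data: one mostly nuclear gene and one mostly cytoplasmic gene.
transcripts = (
    [("Fos", "nucleus")] * 3
    + [("Fos", "cytoplasm")]
    + [("Actb", "nucleus")]
    + [("Actb", "cytoplasm")] * 3
)
fractions = nuclear_fraction(transcripts)
```

A gene with a high nuclear fraction has relatively more nascent transcripts, which is the kind of signal the caption describes as informative about transcription dynamics.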
Figure 7.
Cellular communication inferred from ligand–receptor interactions. The known ligand–receptor interaction pairs are first explored using their gene expression profiles and then passed to a computational tool to generate communication scores that explain connectivity between and within each cell type as shown in A. A spatial graph can be constructed with these scores between different cell types as shown in B and C.
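One simple way to turn ligand and receptor expression into a communication score is to multiply the mean ligand expression in the sending cell type by the mean receptor expression in the receiving cell type. The sketch below uses that product-of-means score as an assumption; the tools covered by the review may define their scores differently, and `LigA` and `RecA` are placeholder gene names rather than a real interaction pair.

```python
def mean_expression(cells, cell_type, gene):
    """Mean expression of `gene` over all cells annotated as `cell_type`."""
    values = [c["expr"].get(gene, 0.0) for c in cells if c["type"] == cell_type]
    return sum(values) / len(values) if values else 0.0


def communication_score(cells, sender, receiver, ligand, receptor):
    """Product of mean ligand expression (sender) and mean receptor expression (receiver)."""
    return mean_expression(cells, sender, ligand) * mean_expression(
        cells, receiver, receptor
    )


# Toy data: astrocytes express the placeholder ligand, neurons the receptor.
cells = [
    {"type": "astrocyte", "expr": {"LigA": 4.0}},
    {"type": "astrocyte", "expr": {"LigA": 2.0}},
    {"type": "neuron", "expr": {"RecA": 1.0}},
    {"type": "neuron", "expr": {"RecA": 3.0}},
]
astro_to_neuron = communication_score(cells, "astrocyte", "neuron", "LigA", "RecA")
neuron_to_astro = communication_score(cells, "neuron", "astrocyte", "LigA", "RecA")
```

Computing this score for every ordered pair of cell types yields a directed, weighted graph of the kind shown in panels B and C. A spatially aware variant would restrict the means to cells that are physical neighbors.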
Figure 8.
An overview of an interactive exploratory analysis pipeline. The integrative and interactive pipeline offers several options for analyzing spatial data sets. (A) Spatial data analysis starts with importing and processing raw data sets. The analysis can then be subdivided into image-based analysis (B) and gene expression–based analysis (C). Analysis based on images, such as cell segmentation and morphological quantification, is available to investigate the cellular intricacies in a selected section of a tissue. Gene expression–based analysis consists of several approaches such as clustering, spatial network construction, and cell type enrichment to visualize gene expression patterns. An interactive graphical interface makes these methods more accessible to novice users.

References

    Abdelaal T, Michielsen L, Cats D, Hoogduin D, Mei H, Reinders MJT, Mahfouz A. 2019. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol 20: 194. doi: 10.1186/s13059-019-1795-z
    Achim K, Pettit J-B, Saraiva LR, Gavriouchkina D, Larsson T, Arendt D, Marioni JC. 2015. High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat Biotechnol 33: 503–509. doi: 10.1038/nbt.3209
    Adekunle DA, Wang ET. 2020. Transcriptome-wide organization of subcellular microenvironments revealed by ATLAS-Seq. Nucleic Acids Res 48: 5859–5872. doi: 10.1093/nar/gkaa334
    Alon S, Goodwin DR, Sinha A, Wassie AT, Chen F, Daugharthy ER, Bando Y, Kajita A, Xue AG, Marrett K, et al. 2021. Expansion sequencing: spatially precise in situ transcriptomics in intact biological systems. Science 371: eaax2656. doi: 10.1126/science.aax2656
    Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, et al. 2020. Orchestrating single-cell analysis with Bioconductor. Nat Methods 17: 137–145. doi: 10.1038/s41592-019-0654-x
