Comparative Study
Cell. 2019 Jun 13;177(7):1873-1887.e17.
doi: 10.1016/j.cell.2019.05.006. Epub 2019 Jun 6.

Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity

Joshua D Welch et al. Cell. 2019.

Abstract

Defining cell types requires integrating diverse single-cell measurements from multiple experiments and biological contexts. To flexibly model single-cell datasets, we developed LIGER, an algorithm that delineates shared and dataset-specific features of cell identity. We applied it to four diverse and challenging analyses of human and mouse brain cells. First, we defined region-specific and sexually dimorphic gene expression in the mouse bed nucleus of the stria terminalis. Second, we analyzed expression in the human substantia nigra, comparing cell states in specific donors and relating cell types to those in the mouse. Third, we integrated in situ and single-cell expression data to spatially locate fine subtypes of cells present in the mouse frontal cortex. Finally, we jointly defined mouse cortical cell types using single-cell RNA-seq and DNA methylation profiles, revealing putative mechanisms of cell-type-specific epigenomic regulation. Integrative analyses using LIGER promise to accelerate investigations of cell-type definition, gene regulation, and disease states.

Keywords: bed nucleus of the stria terminalis; data integration; single-cell genomics; substantia nigra.

Conflict of interest statement

Declaration of Interests

The authors have no financial interests to declare.

Figures

Figure 1: LIGER approach to integration of highly heterogeneous single-cell datasets.
(a) LIGER takes as input two or more datasets, which may come from different individuals, species, or modalities, that share corresponding gene-level features. (b) Integrative nonnegative matrix factorization (Yang and Michailidis, 2016) identifies shared and dataset-specific metagenes across datasets. (c) A graph is built in the resulting factor space by comparing neighborhoods of maximum factor loadings (Methods). Each cell is numbered by its maximum factor loading and connected to its nearest neighbors within each dataset. The shared factor neighborhood graph leverages the factor loading values of neighboring cells to prevent the spurious integration of divergent cell types across datasets (such as the yellow cells shown). See also Figure S1.
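The factorization in panel (b) can be illustrated with a small numerical sketch. It assumes the iNMF objective commonly stated for LIGER: each dataset matrix E_i (cells × genes) is approximated by H_i (W + V_i), where W holds the shared metagenes and V_i the dataset-specific metagenes, with a penalty weight λ on the dataset-specific term. The function name, the matrix shapes, the value of `lam`, and the toy data are illustrative assumptions, not the authors' implementation, and the sketch only evaluates the objective rather than optimizing it.

```python
import numpy as np

def inmf_objective(E_list, H_list, V_list, W, lam=5.0):
    """Evaluate an iNMF-style objective for fixed nonnegative factors.

    E_list: per-dataset expression matrices, each (n_cells_i, n_genes)
    H_list: per-dataset cell loadings, each (n_cells_i, k)
    V_list: per-dataset specific metagenes, each (k, n_genes)
    W:      shared metagenes, (k, n_genes)
    lam:    weight on the dataset-specific penalty (assumed value)
    """
    total = 0.0
    for E, H, V in zip(E_list, H_list, V_list):
        residual = E - H @ (W + V)           # reconstruction error
        total += np.sum(residual ** 2)       # squared Frobenius norm
        total += lam * np.sum((H @ V) ** 2)  # penalize dataset-specific part
    return float(total)

# Toy data: two datasets generated from shared metagenes only.
rng = np.random.default_rng(0)
k, n_genes = 3, 6
W = rng.random((k, n_genes))
H_list = [rng.random((4, k)), rng.random((5, k))]
E_list = [H @ W for H in H_list]

V_zero = [np.zeros((k, n_genes)) for _ in H_list]
exact = inmf_objective(E_list, H_list, V_zero, W)

V_rand = [rng.random((k, n_genes)) for _ in H_list]
perturbed = inmf_objective(E_list, H_list, V_rand, W)
```

In this toy case the data contain no dataset-specific signal, so adding nonzero V_i increases both the reconstruction error and the penalty term.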
Figure 2: Benchmarking LIGER performance.
(a)-(b) t-SNE visualizations of Seurat/CCA (Butler et al., 2018) (a) and LIGER (b) analyses of two scRNA-seq datasets prepared from human blood cells. (c) Alignment metrics for the Seurat and LIGER analyses of the human blood cell datasets, human and mouse pancreas datasets, and hippocampal interneuron/oligodendrocyte datasets. Error bars on the LIGER data points represent 95% confidence intervals across 20 random iNMF initializations. (d)-(e) t-SNE visualizations of Seurat/CCA (d) and LIGER (e) analyses of 3,212 hippocampal interneurons and 2,524 oligodendrocytes. Note the small shared population of doublets in the middle of the t-SNE, highlighting LIGER’s ability to identify rare populations. (f) Agreement metrics for Seurat and LIGER analyses of the datasets listed in (c). (g)-(h) Alignment and agreement for varying proportions of oligodendrocytes mixed with a fixed number of interneurons. (i) Riverplot comparing the previously published clustering results for each blood cell dataset with the LIGER joint clustering assignments. See also Figure S2.
Figure 3: LIGER reveals region-specific and sex-specific cellular specialization in the bed nucleus of the stria terminalis.
(a) t-SNE visualization of 76,693 bed nucleus neurons analyzed by LIGER, colored by cluster, and labeled by an exclusive marker. (b) Top, feature plots showing expression of Sh3d21 and Vipr2 in the LIGER BNST analysis. Bottom, sagittal ABA images of Sh3d21 and Vipr2, showing restricted expression to the BNST oval nucleus. (c) t-SNE visualization of a LIGER analysis of 352 BNST nuclei in clusters BNST_Vip and BNSTp_Cplx3, and 1,060 CGE-derived cortical interneurons (Saunders et al., 2018), colored by dataset (top) and LIGER cluster (bottom). (d) Dot plot showing the relative expression of genes, by dataset, in clusters 1 and 2 of the analysis shown in (c). Each dataset is scaled separately to reconcile differences in sampling between whole cells and nuclei (Methods). (e) Sagittal ABA images of Vip, Lamp5, and Id2 expression; arrows highlight signal present in the BNST. (f) t-SNE visualization of a LIGER analysis of 8,624 nuclei, drawn from three clusters positive for the SPN marker Ppp1r1b (Figure S3), and 10,934 striatal SPNs (Saunders et al., 2018). The striatal SPNs are colored according to their published clustering into three major transcriptional categories (direct, indirect, and eccentric). (g) Dot plots showing expression of canonical SPN genes in the clusters defined in (f). Markers include those of iSPN identity (Adora2a), dSPN identity (Drd1) and the recently described eSPN identity (Tshz1, Otof, and Cacng5), as well as two markers of the BNST-specific cluster 4 (Cdc14a and Hcn1). (h) Coronal ABA image of Cdc14a, showing exclusive expression in the rhomboid nucleus of the anterolateral BNST. (i) Bar plot quantifying dimorphically expressed genes per BNST neuron cluster, identified by a bootstrap analysis (STAR Methods). Note that the two BNSTpr clusters (BNSTpr_St18 and BNSTpr_Esr2) show high numbers of dimorphic genes. (j) Cell factor loading values (top) and gene loading plots (bottom) of top loading dataset-specific and shared genes for factor 27, which loads primarily on one of the BNSTpr clusters. (k) Genes ranked by degree of dimorphism (STAR Methods); positive values indicate increased expression in males, while negative values indicate increased female expression. Positions of previously validated dimorphic genes and X/Y chromosome genes are indicated in blue and red, respectively. The two novel genes shown in panel (l) are indicated in purple. (l) Feature plots showing expression patterns of known (Greb1 and Esr1) and novel (Elt4 and Acvr1c) dimorphic genes across BNST neurons. Abbreviations in ISH images: ac, anterior commissure; act, anterior commissure, temporal limb; ov, oval nucleus; st, stria terminalis; CP, caudate putamen. See also Figure S3 and Data S1.
Figure 4: LIGER allows analysis of substantia nigra (SN) across individuals and species.
(a)-(b) UMAP plots of a LIGER analysis of 44,274 nuclei derived from the SN of 7 human donors, colored by donor (a) and major cell class (b). (c) Violin plots showing expression of marker genes across the 25 human SN populations identified by two rounds of LIGER analysis. (d)-(f) UMAP plots showing cell factor loading values (top) and gene loading plots (bottom) for factors corresponding to an acutely activated polydendrocyte state (d), an activated microglia state (e) and a reactive astrocyte state (f). In gene loading plots, gene names are sorted in decreasing order of magnitude of their factor loading contribution, and correspond to colored points in scatter plots. Plots are organized to show the metagene specific to tissue donors MD5828 and MD5840, and the shared metagene common to all datasets. Genes mentioned in the text are boxed. (g) GO terms enriched in homologous genes with strong expression correlation across SN clusters in the LIGER comparative analysis of human and mouse. (h) GO terms enriched in homologous genes with weak expression correlation. Colors indicate false discovery rate, while size of the circles indicates the number of genes associated with each GO term in (g) and (h). Terms mentioned in the text are highlighted in bold. See also Figure S4.
Figure 5: Locating cortical cell types in space using scRNAseq and STARmap.
(a)-(b) t-SNE plots of a LIGER analysis of 71,000 frontal cortex scRNAseq profiles prepared by Drop-seq (Saunders et al., 2018) and 2,500 cells profiled by STARmap (Wang et al., 2018) colored by technology (a) and LIGER cluster assignment (b). Labels in (b) derive from the published annotations of the Drop-seq dataset. (c) Dot plot showing marker expression for STARmap cells (top line of each gene) and Drop-seq cells (bottom line) across LIGER joint clusters. (d) Spatial locations of STARmap cells colored by LIGER cluster assignments. (e) Density plot showing proportion of cells in which each gene is detected for the Drop-seq (red) and STARmap (blue) datasets. (f)-(h) t-SNE plots and spatial locations for LIGER subclustering analyses of interneurons (f), pyramidal neurons (g) and glia (h). (i) Violin plots of marker genes for two astrocyte populations identified in subclustering analysis of glia. (j) Spatial coordinates for Gfap-expressing astrocyte populations (two STARmap replicates shown). (k) Gfap staining data from the Allen Brain Atlas showing localization of Gfap to both meninges and white matter layer below cortex. See also Figure S5.
Figure 6: Defining cortical cell types using both scRNAseq and DNA methylation.
(a)-(b) t-SNE visualization of LIGER analysis of Drop-seq data (Saunders et al., 2018) and methylation data (Luo et al., 2017) from mouse frontal cortex, colored by modality (a) and LIGER cluster assignment (b). (c) Riverplot showing relationship between published cluster assignments of RNA and methylation data and LIGER joint clusters. (d) Expression and methylation of two claustrum markers. (e) t-SNE representation of the LIGER subcluster analysis of MGE interneurons. (f) Expression and methylation of 4 marker genes for different MGE subpopulations. (g) Boxplots of expression and methylation markers for Sst-Chodl cells (cluster MGE_12). See also Figure S6.
Figure 7: Investigating the connection between DNA methylation and gene expression.
(a) Density plot of the correlation between gene body methylation and expression for short, medium, and long genes. (b) Violin plots showing the wide range of global methylation levels across neural cell types defined by the LIGER analysis in Figure 6. (c)-(e) Scatter plots of global methylation and aggregate expression for (c) Mecp2, (d) Tet3, and (e) Dnmt3a across our joint neural cell clusters. (f) Network of predicted interactions between transcription factors (TFs; red) and differentially methylated regions (DMRs; blue). An edge connects a TF to a DMR if the region contains a binding motif for the TF and has methylation anticorrelated with the expression of the transcription factor (Pearson ρ < −0.45). The network segregates into two largely disconnected components, which are enriched for TFs expressed in excitatory (left) and inhibitory (right) neurons. (g) Genome browser view showing locations of differentially methylated regions near the Arx locus and their correlation with the expression of Arx. The bars indicate sign and magnitude of the correlation. The 3 bottom panels show zoomed-in views of three clusters of DMRs. See also Figure S7.
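The edge rule described in panel (f) can be sketched in a few lines: a TF is connected to a DMR when the region contains a binding motif for the TF and the Pearson correlation between the region's methylation and the TF's expression across clusters falls below −0.45. The array layout, the boolean motif matrix, the function name, and the toy values are hypothetical; only the threshold and the two conditions come from the caption.

```python
import numpy as np

def tf_dmr_edges(tf_expr, dmr_meth, has_motif, threshold=-0.45):
    """Return (tf_index, dmr_index) pairs that satisfy the edge rule.

    tf_expr:   (n_tfs, n_clusters) TF expression per cluster
    dmr_meth:  (n_dmrs, n_clusters) DMR methylation per cluster
    has_motif: (n_tfs, n_dmrs) True where the DMR contains the TF motif
    """
    edges = []
    for t in range(tf_expr.shape[0]):
        for d in range(dmr_meth.shape[0]):
            if not has_motif[t, d]:
                continue  # no motif, so no candidate edge
            rho = np.corrcoef(tf_expr[t], dmr_meth[d])[0, 1]
            if rho < threshold:  # keep only anticorrelated pairs
                edges.append((t, d))
    return edges

# Toy example: one TF, three DMRs, four clusters.
tf_expr = np.array([[1.0, 2.0, 3.0, 4.0]])
dmr_meth = np.array([
    [0.9, 0.7, 0.4, 0.1],  # anticorrelated, motif present
    [0.1, 0.3, 0.6, 0.8],  # positively correlated, motif present
    [0.8, 0.6, 0.5, 0.2],  # anticorrelated, but no motif
])
has_motif = np.array([[True, True, False]])

edges = tf_dmr_edges(tf_expr, dmr_meth, has_motif)
```

Only the first DMR passes both conditions here; the third is anticorrelated with the TF but is excluded because it lacks the motif.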

References

    1. Allen LS, and Gorski RA (1990). Sex difference in the bed nucleus of the stria terminalis of the human brain. The Journal of comparative neurology 302, 697–706.
    2. Amir S, Lamont EW, Robinson B, and Stewart J (2004). A circadian rhythm in the expression of PERIOD2 protein reveals a novel SCN-controlled oscillator in the oval nucleus of the bed nucleus of the stria terminalis. The Journal of neuroscience : the official journal of the Society for Neuroscience 24, 781–790.
    3. Baron M, Veres A, Wolock SL, Faust AL, Gaujoux R, Vetere A, Ryu JH, Wagner BK, Shen-Orr SS, Klein AM, et al. (2016). A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure. Cell Syst 3, 346–360 e344.
    4. Bayless DW, and Shah NM (2016). Genetic dissection of neural circuits underlying sexually dimorphic social behaviours. Philos Trans R Soc Lond B Biol Sci 371, 20150109.
    5. Bayraktar G, and Kreutz MR (2018). The Role of Activity-Dependent DNA Demethylation in the Adult Brain and in Neurological Disorders. Front Mol Neurosci 11, 169.
