This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2023 Apr 18:2023.04.16.536509.

doi: 10.1101/2023.04.16.536509.

Single-cell DNA Methylome and 3D Multi-omic Atlas of the Adult Mouse Brain

Hanqing Liu¹, Qiurui Zeng¹,², Jingtian Zhou¹,³, Anna Bartlett¹, Bang-An Wang¹, Peter Berube¹,², Wei Tian¹, Mia Kenworthy¹, Jordan Altshul¹, Joseph R Nery¹, Huaming Chen¹, Rosa G Castanon¹, Songpeng Zu⁴, Yang Eric Li⁴, Jacinta Lucero⁵, Julia K Osteen⁵, Antonio Pinto-Duarte⁵, Jasper Lee⁵, Jon Rink⁵, Silvia Cho⁵, Nora Emerson⁵, Michael Nunn¹, Carolyn O'Connor⁶, Zizhen Yao⁷, Kimberly A Smith⁷, Bosiljka Tasic⁷, Hongkui Zeng⁷, Chongyuan Luo⁸, Jesse R Dixon⁹, Bing Ren⁴,¹⁰,¹¹, M Margarita Behrens⁵, Joseph R Ecker¹,¹²

Affiliations

¹ Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA.
² Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA.
³ Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA.
⁴ Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA.
⁵ Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA.
⁶ Flow Cytometry Core Facility, The Salk Institute for Biological Studies, La Jolla, CA, USA.
⁷ Allen Institute for Brain Science, Seattle, WA, USA.
⁸ Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA.
⁹ Peptide Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA.
¹⁰ Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA.
¹¹ Institute of Genomic Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA.
¹² Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA.

PMID: 37131654
PMCID: PMC10153407
DOI: 10.1101/2023.04.16.536509

Hanqing Liu et al. bioRxiv. 2023.

Update in

Single-cell DNA methylome and 3D multi-omic atlas of the adult mouse brain.
Liu H, Zeng Q, Zhou J, Bartlett A, Wang BA, Berube P, Tian W, Kenworthy M, Altshul J, Nery JR, Chen H, Castanon RG, Zu S, Li YE, Lucero J, Osteen JK, Pinto-Duarte A, Lee J, Rink J, Cho S, Emerson N, Nunn M, O'Connor C, Wu Z, Stoica I, Yao Z, Smith KA, Tasic B, Luo C, Dixon JR, Zeng H, Ren B, Behrens MM, Ecker JR. Nature. 2023 Dec;624(7991):366-377. doi: 10.1038/s41586-023-06805-y. Epub 2023 Dec 13. PMID: 38092913

Abstract

Cytosine DNA methylation is essential in brain development and has been implicated in various neurological disorders. A comprehensive understanding of DNA methylation diversity across the entire brain in the context of the brain's 3D spatial organization is essential for building a complete molecular atlas of brain cell types and understanding their gene regulatory landscapes. To this end, we employed optimized single-nucleus methylome (snmC-seq3) and multi-omic (snm3C-seq¹) sequencing technologies to generate 301,626 methylomes and 176,003 chromatin conformation/methylome joint profiles from 117 dissected regions throughout the adult mouse brain. Using iterative clustering and integrating with companion whole-brain transcriptome and chromatin accessibility datasets, we constructed a methylation-based cell type taxonomy that contains 4,673 cell groups and 261 cross-modality-annotated subclasses. We identified millions of differentially methylated regions (DMRs) across the genome, representing potential gene regulation elements. Notably, we observed spatial cytosine methylation patterns on both genes and regulatory elements in cell types within and across brain regions. Brain-wide multiplexed error-robust fluorescence in situ hybridization (MERFISH²) data validated the association of this spatial epigenetic diversity with transcription and allowed the mapping of the DNA methylation and topology information into anatomical structures more precisely than our dissections. Furthermore, multi-scale chromatin conformation diversities occur in important neuronal genes, highly associated with DNA methylation and transcription changes. Brain-wide cell type comparison allowed us to build a regulatory model for each gene, linking transcription factors, DMRs, chromatin contacts, and downstream genes to establish regulatory networks. Finally, intragenic DNA methylation and chromatin conformation patterns predicted alternative gene isoform expression observed in a companion whole-brain SMART-seq³ dataset. Our study establishes the first brain-wide, single-cell resolution DNA methylome and 3D multi-omic atlas, providing an unparalleled resource for comprehending the mouse brain's cellular-spatial and regulatory genome diversity.

PubMed Disclaimer

Conflict of interest statement

Competing Interests: J.R.E. serves on the scientific advisory board of Zymo Research Inc. B.R. is a shareholder of Arima Genomics Inc. and Epigenome Technologies, Inc. H.Z. is on the scientific advisory board of MapLight Therapeutics, Inc.

Figures

**Extended Data Figure 1 |. Brain dissection regions.**
Schematic of brain dissection steps. Each male C57BL/6 mouse brain (age P56) was dissected into 600-μm slices for snmC-seq3 **(a)** and 1,200-μm slices for snm3C-seq3 **(b)**. We then dissected brain regions from both hemispheres within a specific slice.

**Extended Data Figure 2 |. Quality control for snmC and snm3C datasets.**
**a-b,** The number of input reads and final pass-QC reads in snmC-seq3 and snm3C-seq shown by t-SNE **(a)** and violin plot **(b)**. **c,** The percentage of chrom100k bins or genes detected per cell in snmC-seq3 and snm3C-seq. Gray lines from top to bottom indicate the 75%, 50%, and 25% quantiles. **d-e,** The number and ratio of cis-long and trans contacts in snm3C-seq, depicted by t-SNE **(d)** and violin plot **(e)**. f, Heatmap of PCC between the average methylome profiles (mean mCH and mCG fraction of all chromosome 100-kb bins across all cells belonging to a replicate sample). The violin plot below summarizes the values between replicates within the same brain region or between different brain regions. **g-h,** Pairwise overlap score (measuring co-clustering of two replicates) of neuronal subtypes **(g)** and non-neuronal subtypes **(h)**. The violin plots summarize the subtype overlap score between replicates within the same brain region or between different brain regions. i, Distribution of the mCG, mCH, mCCC, and Lambda DNA fraction (non-conversion rate) at sample level in snmC-seq3 and snm3C-seq. j, Pre-clustering t-SNE of the snmC and snm3C datasets colored by final mC reads and plate-normalized cell coverage. Arrows indicate typical low-quality clusters filtered out from further analysis.

**Extended Data Figure 3 |. Metadata of snmC-seq and snm3C-seq datasets.**
**a-c**, t-SNE of snmC-seq colored by cell subclass **(a)**, major regions **(b)**, and dissection regions **(c)**. **d-f**, t-SNE of snm3C-seq colored by cell subclass **(d)**, major regions **(e)**, and dissection regions **(f)**. **g,h,** Cell-level t-SNE of snmC-seq and snm3C-seq colored by global mCG **(g)** and global mCH **(h)** fraction. i, The average global mCG and mCH fractions for neurons in different dissection regions. Regions are ordered by the global mCH fractions, and only the top and bottom 20 regions are shown. j, The average global mCG and mCH fractions for all cell subclasses. Subclasses are ordered by the global mCH level, and only the top and bottom 20 subclasses are shown.

**Extended Data Figure 4 |. t-SNE embedding by major regions.**
This figure groups cells by major regions (first five rows), including isocortex (CTX), olfactory bulb (OLF), amygdala (AMY), cerebral nuclei (CNU), hippocampus (HPF), thalamus (TH), hypothalamus (HY), midbrain (MB), hindbrain (HB), and cerebellum (CB). Each section comprises three columns. The left column displays the CCF-registered 3D brain dissection regions and the corresponding cells on the whole-brain t-SNE. The middle and right columns show the t-SNE embedding of cells from this major region, colored by cell subclasses and dissection regions, respectively. The numbers on the t-SNE plot indicate the cell subclass ID, which is listed in Supplementary Table 4. The final row groups non-neuronal cells into two sections based on telencephalon and non-telencephalon dissection regions.

**Extended Data Figure 5 |. Example genes illustrating high-granularity correspondence between methylome and transcriptome.**
a, Schematic representation of the normalized gene body mCH fraction (left panel) and RNA CPM value (right panel) at the cell-group-centroids t-SNE plot for each gene. **b-d,** Example gene groups: neurotransmitter-related genes **(b)**, immediate early genes **(c),** and neuropeptide genes **(d)**.

**Extended Data Figure 6 |. Integration of snATAC-seq and snmC-seq3 data.**
a, Barplot displays the alignment scores of each dissection region, calculated in the low-dimensional space of the snATAC-seq and snmC-seq integration. b, t-SNE shows the co-embedding of snmC-seq and snATAC-seq data, grouped by major regions and colored by dissection regions. **c-d,** Heatmap visualization of 15 × 15 small heatmaps. Each small heatmap represents the mCG fractions (green) and the corresponding accessibility level of 1,000 cell-type-specific CG-DMRs. Cell subclasses from isocortex **(c)** and midbrain **(d)** are shown as examples.

**Extended Data Figure 7 |. MERFISH data processing and annotation.**
a, Workflow illustrating the generation of MERFISH data, including sample preparation, imaging, and data analysis steps. b, Quality control assessment for each MERFISH sample, where the red lines represent the filtering cutoff for various quality metrics, including RNA total counts, RNA feature counts, blank gene number, cell volume (μm³), and RNA counts per volume. c, Integration t-SNE plot of the MERFISH and scRNA datasets colored by cell subclasses. d, MERFISH cells colored by cell subclasses, with labels obtained from the integration with the RNA dataset. From top to bottom, the cells are displayed by glutamatergic neurons, other neurons, and non-neurons. e, Spatial epigenetic patterns of *Negr1* and its associated DMRs. Brain slices in the left column are color-coded by normalized gene body mCH fraction, mCG fraction of the DMR (chr3:154,927,600–154,929,099), and RNA expression. The right column displays the normalized contacts heatmap between the DMR and gene.

**Extended Data Figure 8 |. Distribution of snmC-seq cell subclasses on MERFISH slices.**
MERFISH plot depicting the spatial distribution of snmC-seq cells colored by cell subclass on imputed MERFISH locations (Methods). Each row represents a different MERFISH slice. The left column shows glutamatergic neurons and the right column shows other neurons. Centroids of each cell subclass are indicated by arrows, with the numbers indicating their cell proportion on that slice.

**Extended Data Figure 9 |. Chromatin conformation analysis at compartment and domain level.**
a, PCC between compartment score and mCG (orange)/mCH (blue) fractions of all 100-kb bins on each chromosome (left panel) or whole genome (right panel). The dotted lines inside each violin plot are the 75%, 50%, and 25% quantiles from top to bottom. **b-c,** Chromosome 1-D heatmaps show the PCC between compartment score and mCG fraction **(b)** and the compartment score STD across cell subclasses **(c)** for each chromosome at 100-kb resolution. Arrows indicate the location of the *Celf2* gene used as an example in Fig. 4a, b. d, The line plot (mean±s.d.) shows the developmental gene expression level among subtypes defined in La Manno et al. across embryonic days. The genes in each subpanel are selected by overlapping with top negatively correlated (left), positively correlated (right), or uncorrelated (middle) chrom100k bins in **(a)**. e, Workflow for gene body domain boundary analysis. f, The scatter plots of the most negatively (top) or positively (bottom) correlated boundary to each long gene transcript. Both axes show the PCC between the 25-kb bin boundary probability and transcript body mCH (x-axis) or mCG (y-axis) fractions. g, The scatterplot shows the location of each long gene transcript’s most negatively (top) or positively (bottom) correlated boundary. The y-axis is the PCC between the 25-kb bin boundary probabilities and transcript body mCH fractions; the x-axis is the relative genome location to the transcripts. h, Functional enrichment for genes associated with negatively correlated domain boundaries (upper) or positively correlated boundaries (lower).
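
The per-bin correlation described in panel a can be illustrated with a minimal sketch. It assumes each 100-kb bin has one compartment score and one mCG fraction per cell subclass; the function name `pearson` and all numeric values are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # PCC is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for one 100-kb bin across five cell subclasses.
compartment_score = [0.8, 0.5, 0.1, -0.3, -0.6]
mcg_fraction = [0.62, 0.70, 0.78, 0.83, 0.90]

# A strongly negative value means the bin is less methylated in the
# subclasses where it sits deeper in the A compartment.
pcc = pearson(compartment_score, mcg_fraction)
```

Repeating this for every bin would give the per-bin distribution that a violin plot summarizes; how the study handled missing bins or low-coverage subclasses is not stated in the caption.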

**Extended Data Figure 10 |. Correlation between gene expression and chromatin contacts.**
a, Workflow for highly variable and gene correlated interaction analysis. b, The distribution of the distance between the furthest correlated interaction and gene TSS. Q95 and Q99 stand for the quantile of all interactions ordered by the distance to TSS. c, Distribution of the number of highly variable and correlated interactions per gene; top 30 gene names are listed. d, Scatterplot shows each gene’s number of correlated interactions (y-axis) and TSS boundary probability correlation (x-axis, PCC between mCH and TSS boundary probability, from Extended Data Fig. 9e). **e-j,** Compound heatmaps display the chromatin conformation landscape of megabase-long genes, including *Ptprd* **(e)**, *Nrxn3* **(f)**, *Lsamp* **(g)**, *Dlg2* **(h)**, *Celf2* **(i),** and *Sox5* **(j)**. For each panel, green rectangles indicate the location of the gene body, the lower triangle shows the F statistics from an ANOVA of contact strength across all cell subclasses (similar to Fig. 4i), and the upper triangle shows the PCC between contact strength and mCH fraction (similar to Fig. 4j).

**Extended Data Figure 11 |. Construction of TF-DMRs-Target regulatory networks.**
a, Scatterplot shows the motif enrichment scores in negatively correlated DMRs (x-axis) and positively correlated DMRs (y-axis) for each TF. The top TFs with the highest motif enrichment scores are listed. Blue contours are the kernel density of the dots. **b-c,** Example TFs with motifs enriched in positively correlated DMRs or negatively correlated DMRs are shown in more detail (similar to Fig. 5f). The *Onecut2* and *Rfx1* genes **(b)** are examples with motifs enriched in positively correlated DMRs; the *Foxp2* and *Foxa1* genes **(c)** are examples with motifs enriched in negatively correlated DMRs. d, The top histogram shows the distribution of the number of DMRs each motif is enriched in. The bottom histogram shows the distribution of the number of motif occurrences each DMR has. e, The TF-DMR-Target triples are separated into eight categories (columns) based on their PCC sign between Gene-DMR, TF-DMR, and TF-Gene. The top barplot is the triple distribution in each category. The middle violin plot is the triple final score distribution within each category. Lines inside the violin plot are 25%, 50%, and 75% quantiles, respectively. The bottom dots show the correlation sign combination of each category. Column colors match the schematic in **(f)**. f, The schematic displays the potential regulatory model for the four most common (based on e) TF-DMR-Target triple categories.

**Extended Data Figure 12 |. TF-DMR-Gene triples predict TF and gene relationships.**
**a-f,** Example TF-DMR-Target triples, including 1: Erf (TF), Nab2 (target) and DMR (Chr10:127,595,357–127,595,787) **(a-b)**; 2: Egr1 (TF), Synpo (target) and DMR (Chr18:60,762,310–60,763,534) **(c-d)**; 3: Stat5b (TF), Cacna2d2 (target) and DMR (Chr9:107,462,798–107,463,968) **(e-f)**. For each example, left are t-SNE plots colored by the mCH fraction (blue) or RNA level (purple) for target and TF; mCG fraction (green) and chromatin accessibility (orange) for DMR; and gene-DMR contact score (red) **(a,c,e)**. The compound heatmaps on the right show the chromatin landscape of target genes, including *Nab2* **(b)**, *Synpo* **(d),** and *Cacna2d2* **(f);** the layout is similar to Extended Data Fig. 10e–j. g, The dot plots represent TF’s normalized PageRank Score and RNA expression for cell subclasses in the midbrain (MB). Red dots are colored and sized by PageRank Score. Purple dots are colored by RNA CPM, sized by the percentage of cells in that subclass expressing this gene. Right, the t-SNE plot of snmC-seq cells from MB colored by dissection region and the CCF-registered 3D brain dissection regions.

**Extended Data Figure 13 |. Epigenetic heterogeneity and gene exon usage.**
a, Compound heatmaps illustrate the similarity between the *Oxr1* intragenic methylation heterogeneity and alternative isoform expression patterns. Rows are neuron cell subclasses. I, mCG fraction of all 1,797 CpG sites of *Oxr1* gene with columns ordered by original genome coordinates (bottom colors are CpG clusters from heatmap II). II, mCG fraction of CpG sites re-ordered by their CpG clusters (bottom colors) based on subclasses methylation pattern. Heatmap III and Heatmap IV show the TPM of 11 highly variable transcripts and PSI of 24 highly variable exons of *Oxr1*, quantified with the SMART-seq dataset. All values are z-score normalized across cell subclasses. The *Oxr1* transcript structures and exon locations are indicated at the bottom plots. Heatmap V shows the *Oxr1* gene log(CPM) in scRNA-seq (10X) data. b, Scatterplot shows the PCC between predicted PSI and true PSI for each highly-variable exon (dot), using methylation features (left) and chromatin contact interactions (right) to predict. c, Scatterplot shows the delta PCC in mC models (x-axis) and m3C models (y-axis) for highly-variable exons (dot). Top exons with large delta PCC are listed by their corresponding gene names.
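
The two quantities behind heatmap IV can be sketched as follows, assuming PSI carries its usual meaning of percent spliced in (inclusion reads over inclusion plus exclusion reads) and that z-scores are taken per exon across subclasses. The read counts are hypothetical, and the use of the population standard deviation is an assumption; the caption does not say which estimator was used.

```python
import statistics


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in for one exon; None when no informative reads exist."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


def zscore(values):
    """Z-score a list of values (population SD; an assumed choice)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant input carries no contrast
    return [(v - mean) / sd for v in values]


# Hypothetical (inclusion, exclusion) read counts for one exon
# in three cell subclasses.
counts = [(30, 10), (20, 20), (5, 35)]
psi_values = [psi(inc, exc) for inc, exc in counts]
psi_z = zscore(psi_values)
```

After normalization each exon has mean 0 across subclasses, so heatmap rows compare relative usage rather than absolute inclusion levels.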

**Figure 1 |. Single-cell DNA methylome and multi-omic atlas chart the cellular and genomic diversity of the whole mouse brain.**
a, The workflow of dissection, nuclei, and library preparation for snmC-seq3 and snm3C-seq. b, The 117 dissection regions from eighteen 600-μm coronal slices are grouped into ten major brain regions (see Supplementary Table 9 for abbreviations). Each dissection region is registered to the 3D common coordinate framework (CCF). c, The cell atlas: methylome-based iterative clustering on snmC and snm3C datasets. The left t-SNE plot is colored by modality; the middle plot is aggregated into 4,673 cell group centroids and colored by 261 cell subclasses; the right part demonstrates cross-modality integration of brain-wide datasets from BICCN, with details in Figure 2. d, The genome atlas: the *Tle4* gene exemplifies pseudo-bulk profiles of five modalities across the whole brain, with a genome browser view of the “L6 CT CTX Glut” and “Pvalb Gaba” subclasses at the bottom.

**Figure 2 |. Consensus cell type taxonomy across molecular modalities.**
a, Cell-group-centroids t-SNE colored by cell subclass (see Extended Data Fig. 3 for number legends’ abbreviations). b, Cell-level t-SNE colored by 117 dissection regions. c, 3D CCF registration and cell t-SNE of each major region. d, Cell subclass (upper row) and neurotransmitter composition (bottom row) of each brain dissection region (each upper dot), grouped by major region. e, Integration t-SNE of all neurons from the snmC-seq, snm3C-seq, snATAC-seq, and scRNA-seq datasets, colored by matched cell subclasses. f, Brain-wide cluster map between the snmC-seq and scRNA-seq datasets (Supplementary Table 4) based on iterative integration. Each dot, colored by subclasses, on the diagonal represents a link between the mC clusters (x-axis) and RNA clusters (y-axis). Two examples in floating panels demonstrate highly granular correspondence of cell clusters in the final integration round: Box 1 presents the integration t-SNE colored by intra-modality clusters and confusion matrix of overlap score between “MB-MY Glut-Sero” clusters; Box 2 displays the same information for “L5 ET CTX Glut” clusters. See Extended Data Fig. 5 for more gene details. g, Dot plots of mCG fraction (left) and chromatin accessibility (right) of cell-type-specific CG-DMRs (columns) in each cell subclass (row). The size and color of each dot represent an aggregated epigenetic profile of 1,000 DMRs in a cell subclass; larger dot size and deeper color indicate these DMRs are more hypo-methylated or accessible in a subclass. See Extended Data Fig. 6 for more mC-ATAC integration details.

**Figure 3 |. Coherent spatial epigenomic and transcriptomic diversity in the brain.**
**a-c,** Spatial methylation patterns of DMGs and DMRs across three brain axes: anterior to posterior **(a)**, dorsal to ventral **(b),** and medial to lateral **(c)**. d, Workflow of mC-MERFISH integration and spatial embedding of methylome cells. e, Spatially mapped methylation cell atlas. The first row displays CCF-registered brain dissection regions. The second and third rows show imputed spatial locations for glutamatergic and other neurons colored by dissection regions. f, Spatial distribution of cell subclasses for glutamatergic neurons and other neurons on slice 10. g, Spatial epigenetic pattern of neuronal genes and their associated DMRs. The *Elavl2* gene represents a spatial pattern among subcortical regions; the left column shows gene body mCH fraction, DMR (chr13:91,164,342–91,165,792) mCG fraction, and RNA expression. The right column displays the normalized contacts heatmap between the DMR and gene. h, The *Rasgrf2* gene and associated DMR (chr13:92,027,775–92,028,983) exhibit cortical layer differences in the same layout as **(g)**.

**Figure 4 |. Highly dynamic chromatin conformation features correlate with DNA methylation around neuronal genes.**
This figure displays chromatin conformation diversity at three levels: chromatin compartments **(a-d)**, gene body domains **(e-h),** and highly variable contacts **(i-m)**. a, Top heatmaps are the Pearson-correlation matrices of chr2. Middle plots show the compartment score across chr2 (red and blue indicate A and B compartments, respectively); the bottom row shows the zoom-in view of the *Celf2* gene locus. Three columns from left to right are “L2/3 IT CTX Glut” (C1), “Oligo NN” (C2), and (C1 - C2) delta values. b, Cell-group-centroids t-SNE colored by compartment score and mCG fraction. c, Scatterplot of chrom100k bins, showing PCC between compartment score and chrom100k mCG fraction (x-axis) and compartment score standard deviation (STD) across cell subclasses (y-axis). The blue contours indicate the dots’ kernel density. d, Functional enrichment for genes intersected with negatively correlated chrom100k bins (boxed in c). e, Top heatmaps are normalized chromatin contact matrices around the *Lingo2* gene from “L2/3 IT CTX Glut” (C1) and “MSN D2 Gaba” (C3). The bottom genome tracks are the corresponding pseudo-bulk ATAC and methylome profiles. f, t-SNE colored by the *Lingo2* TSS boundary probability and *Lingo2* mCH fraction. g, Average boundary probabilities of 25-kb bins around long and short genes. h, The scatterplot shows the location of each long gene transcript’s most negatively correlated boundary. The y-axis is the PCC between the 25-kb bin boundary probabilities and transcript body mCH fractions; the x-axis is the relative genome location to the transcripts. **i,j,** Heatmap of F statistics from one-way ANOVA analysis measuring the variance of contact strength across cell subclasses **(i)** and PCC between the *Lingo2* mCH fraction and highly variable interactions’ contact strengths around the *Lingo2* gene **(j)**. The white circles are two loop-like highly variable interactions. Arrows point to strips between interactions and gene bodies. k, t-SNE colored by normalized contact strengths for interactions 1 and 2 in **(j)**. l, Pileup view of the relative genome location of correlated interactions from all genes. The colors in the upper triangle are average PCCs. Location categories include intragenic (I), upstream (U), downstream (D), upstream-intragenic (U-I), downstream-intragenic (D-I), and upstream-downstream (U-D). m, Heatmap showing chromatin landscape of megabase-long genes, green rectangles indicate the location of gene body, the lower triangle is F statistics similar to **(i)**, and the upper triangle is PCC values similar to **(j)**.
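
The F statistic shown in panel i can be illustrated for a single chromatin interaction with a minimal one-way ANOVA sketch. It assumes the contact strengths of that interaction are grouped by cell subclass, one list per subclass; the function name and the values are hypothetical, and any normalization or multiple-testing step used in the study is not reproduced here.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of numeric groups."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of subclass means.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of cells inside each subclass.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return float("inf")
    return ms_between / ms_within


# Hypothetical contact strengths of one interaction in three subclasses.
contacts_by_subclass = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(contacts_by_subclass)
```

A large F value marks an interaction whose strength differs more between subclasses than within them, which is the sense in which an interaction is called highly variable.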

**Figure 5 |. Gene regulatory networks predict binding elements, downstream targets, and cell-type importance of transcription factors.**
a, Schematic depicting the three components of the GRN, with two density plots displaying the PCC between the gene’s mCH fractions and RNA expression (right top) and the PCC between the DMR’s mCG fractions and chromatin accessibility (right bottom). b, The density plot shows the PCC between each DMR’s mCG fractions and the target gene’s mCH fractions. Gray represents the null distribution; light blue represents all correlations; blue represents correlations between DMRs overlapping with the target gene’s correlated interaction anchors. c, The scatter plot displays the DMR location and PCC between DMR mCG and gene mCH. Each gray dot represents a DMR-target edge. The blue line represents the moving quantile of PCC. d, Schematic of the DMR-Target edge for *Psd2* (top row) and *Celf2* (bottom row). From left to right, the t-SNE plot is colored by gene mCH fraction, gene-DMR contacts, and DMR mCG fraction. e, The density plot shows PCC between the mCH fraction of TF and the target gene. f, Top, PCC between *Nfia* mCH fraction and DMRs mCG fraction. Bottom, cisTarget motif enrichment score in 50 DMR groups ordered and grouped by the *Nfia*-DMR PCC value above. The example t-SNE plots are colored by the *Nfia* mCH fraction and mCG fraction of a positively correlated DMR. g, Schematic shows the TF-DMR-Target triple and the final score. h, Distribution of all triples’ final scores (from g) in the final network. Histograms show the number of triples that each TF, gene, and DMR is involved in. i, An example triple of *Egr1* (TF), *Nab2* (target), and DMR. t-SNE plots colored by the gene’s mCH fraction or RNA level; the DMR’s mCG fraction and chromatin accessibility; and gene-DMR contact score. j, Left, schematic shows the calculation of the PageRank score (Methods). Right, dot plots represent TF’s normalized PageRank Score and RNA expression for cell subclasses in the hindbrain (HB). Red dots are colored and sized by PageRank Score. Purple dots are colored by RNA CPM, sized by the percentage of cells in that subclass expressing this gene. k, Left, schematic of RFX family sub-networks. Right, t-SNE plot colored by normalized PageRank Score (top) and cell subclasses where normalized PageRank score > 0.
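
To make the PageRank score in panel j concrete, the following is a generic weighted PageRank by power iteration, not the study's implementation (the caption defers to Methods). The edge direction (target gene pointing to its TF), the edge weights, the damping factor of 0.85, and all node names are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """Weighted PageRank; edges are (source, destination, weight) tuples."""
    if any(w <= 0 for _, _, w in edges):
        raise ValueError("edge weights must be positive")
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for src, _, w in edges:
        out_weight[src] += w
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src, dst, w in edges:
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical network: each target gene points to the TF predicted to
# regulate it, so a TF with more (or heavier) targets collects more rank.
edges = [("GeneA", "TF1", 1.0), ("GeneB", "TF1", 1.0), ("GeneC", "TF2", 1.0)]
scores = pagerank(edges)
```

In this toy network `TF1` outranks `TF2` because it receives rank from two targets. Running the same procedure on a subclass-specific network would yield one score per TF per subclass, which is the kind of value a dot plot can encode.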

**Figure 6 |. Epigenetic heterogeneity predicts gene isoform diversity.**
a, Workflow for the integrative analysis between epigenome and transcriptome datasets. b, Compound heatmaps illustrate the similarity between the *Nrxn3* intragenic methylation heterogeneity and alternative isoform expression patterns. Rows are neuron cell subclasses. I, mCG fraction of all 6,138 CpG sites of *Nrxn3* gene with columns ordered by original genome coordinates (bottom colors are CpG clusters from heatmap II). II, mCG fraction of CpG sites re-ordered by their CpG clusters (bottom colors) based on subclasses methylation pattern. Heatmap III and Heatmap IV show the TPM of 14 highly variable transcripts and PSI of 38 highly variable exons of *Nrxn3*, quantified with the SMART-seq dataset. All values are z-score normalized across cell subclasses. The *Nrxn3* transcript structures and exon locations are indicated at the bottom plots. Red arrows point to beta-*Nrxn3* transcripts and one associated CpG cluster. Heatmap V shows the *Nrxn3* gene log(CPM) in scRNA-seq (10X) data. c, Schematic illustrates the process for constructing the prediction model. d, Scatterplot shows the PCC between predicted TPM and true TPM for each highly-variable transcript (dot), using methylation features (left) and chromatin contact interactions (right) to predict. e, Scatterplot shows the delta PCC in mC models (x-axis) and m3C models (y-axis) for highly-variable transcripts (dot). Top transcripts with large delta PCC are listed by their corresponding gene names. f, Genome browser view of intragenic epigenetic and isoform diversity of the *Nrxn3* gene in five cell subclasses (rows). The middle heatmaps are normalized contact strengths of the *Nrxn3* gene locus, with arrows pointing to strips over the beta-*Nrxn3* transcript body. The zoom-in panels show alpha-*Nrxn3*’s (left) and beta-*Nrxn3*’s (right) TSS region, with mCG fraction (green), mCH fraction (blue), and SMART RNA (bottom) expression tracks. g, Similar to f, showing the corresponding intragenic epigenetic and isoform diversity in the *Oxr1* gene.

References

1. Lee D.-S. et al. Simultaneous profiling of 3D genome structure and DNA methylation in single human cells. Nat. Methods 16, 999–1006 (2019).
2. Chen K. H., Boettiger A. N., Moffitt J. R., Wang S. & Zhuang X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
3. Picelli S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
4. Wang Q. et al. The Allen Mouse Brain Common Coordinate Framework: A 3D Reference Atlas. Cell 181, 936–953.e20 (2020).
5. Yao Z. et al. A transcriptomic and epigenomic cell atlas of the mouse primary motor cortex. Nature 598, 103–110 (2021).
