Nature. 2021 Oct;598(7879):129-136.
doi: 10.1038/s41586-021-03604-1. Epub 2021 Oct 6.

An atlas of gene regulatory elements in adult mouse cerebrum


Yang Eric Li et al. Nature. 2021 Oct.

Abstract

The mammalian cerebrum performs high-level sensory perception, motor control and cognitive functions through highly specialized cortical and subcortical structures [1]. Recent surveys of mouse and human brains with single-cell transcriptomics [2-6] and high-throughput imaging technologies [7,8] have uncovered hundreds of neural cell types distributed in different brain regions, but the transcriptional regulatory programs that are responsible for the unique identity and function of each cell type remain unknown. Here we probe the accessible chromatin in more than 800,000 individual nuclei from 45 regions that span the adult mouse isocortex, olfactory bulb, hippocampus and cerebral nuclei, and use the resulting data to map the state of 491,818 candidate cis-regulatory DNA elements in 160 distinct cell types. We find high specificity of spatial distribution for not only excitatory neurons, but also most classes of inhibitory neurons and a subset of glial cell types. We characterize the gene regulatory sequences associated with the regional specificity within these cell types. We further link a considerable fraction of the cis-regulatory elements to putative target genes expressed in diverse cerebral cell types and predict transcriptional regulators that are involved in a broad spectrum of molecular and cellular pathways in different neuronal and glial cell populations. Our results provide a foundation for comprehensive analysis of gene regulatory programs of the mammalian brain and assist in the interpretation of noncoding risk variants associated with various neurological diseases and traits in humans.

Conflict of interest statement

B.R. is a co-founder and consultant of Arima Genomics Inc. and co-founder of Epigenome Technologies. K.J.G. is a consultant of Genentech and shareholder in Vertex Pharmaceuticals. A. J.R.E is on the scientific advisory board of Zymo Research, Inc.

Figures

Fig. 1. Single-cell analysis of chromatin accessibility in the adult mouse cerebrum.
a, Schematic of sample dissection strategy. A detailed list of regions is in Supplementary Table 1. b, Uniform manifold approximation and projection (UMAP) embedding and clustering analysis of snATAC-seq data. A full list and description of cluster labels are in Supplementary Tables 2, 3. c, UMAP embedding and analysis as in b but coloured by subregions. Dotted lines demarcate major cell classes. ACA, anterior cingulate area; ACB, nucleus accumbens; AI, agranular insular area; AON, anterior olfactory nucleus; CA, cornu ammonis; CP, caudoputamen; DG, dentate gyrus; LSX, lateral septal nucleus; MOB, main olfactory bulb; MOs, secondary motor area; MOp, primary motor area; ORB, orbital area; PAL, pallidum; PFC, prefrontal cortex; PIR, piriform area; SSs, secondary somatosensory area; SSp, primary somatosensory area. d, Left, hierarchical organization of subclasses based on chromatin accessibility. Middle, each subclass represents 1–13 cell types. Right, the number of nuclei in each subclass. e, Genome browser tracks of aggregate chromatin accessibility profiles for each subclass at selected marker gene loci that were used for cell cluster annotation. A full list and description of subclass annotations are in Supplementary Table 4. f, Bar chart representing the relative contribution of subregions to 160 cell types. g, Stacked histograms showing the regional specificity scores of subclasses and cell types of GABAergic neurons (GABA), glutamatergic neurons (Glu) and non-neurons (NonN).
Fig. 2. Identification and characterization of candidate CREs across mouse cerebral cell types.
a, Pie chart showing the fraction of cCREs that overlaps with different classes of annotated sequences in the mouse genome. TSS, transcription start site; TTS, transcription termination site. b, Venn diagram showing the overlap between cCREs defined in this study (in red) and the DHSs in the SCREEN database (in grey). c, Box plots showing sequence conservation measured by PhastCons score. ***P < 0.001, Wilcoxon rank-sum test. Boxes span the first to third quartiles (Q1 to Q3), horizontal line denotes the median, and whiskers show 1.5× the interquartile range (IQR). d, Bar chart showing the fold enrichment of cCREs within the different states of mouse brain chromatin annotated with a 15-state ChromHMM model. e, Density map comparing the median and maximum variation of chromatin accessibility at each cCRE across cell types. Each dot represents a cCRE. Red box highlights elements with low chromatin accessibility variability across clusters. f, Stacked bar plot showing the fraction of cCREs with low variability overlapping with distinct genomic features. g, Heat map showing association of the 43 subclasses (rows) with 42 cis-regulatory modules (top, from left to right). Columns represent cCREs. A full list of subclass or module associations is in Supplementary Table 9, and the association of cCREs to modules is in Supplementary Table 10. CPM, counts per million.
Fig. 3. Regional specificity of cell types correlates with chromatin accessibility at cCREs.
a, UMAP embedding of SSTGA cell types. b, Normalized chromatin accessibility of 50,079 cell-type-specific cCREs. c, Motif enrichment analysis for cell-type-specific cCREs. d, Expression level of genes encoding members of the KLF transcription factor family in SSTGA10. Expression values of Sst-Chodl cells were extracted from the Allen Brain Atlas (atlas.brain-map.org). e, UMAP embedding of astrocyte cell types. f, Contribution of brain regions to each astrocyte cell type. CNU, cerebral nuclei; OLF, olfactory area; HPF, hippocampus. g, Normalized accessibility of astrocyte cell-type-specific cCREs. h, Motif enrichment in astrocyte cell-type-specific cCREs. Hox, homeobox; HTH, helix-turn-helix; Zf, zinc-finger. i, Genome browser view showing ASCN-specific cCREs at the Olig2 locus. j, Heat map showing the fraction of overlap between spatially mapped genes from ISH in different brain structures and genes specific to astrocyte cell types. ***P < 0.0001, Fisher’s exact test. k, Bar plot showing log2-transformed expression value of Itih3 in each brain region from ISH experiments. CTXsp, cortical subplate. Images downloaded from © 2004 Allen Institute for Brain Science. Allen Mouse Brain Atlas. Available from: atlas.brain-map.org. l, Views of ISH experiment from the Allen Brain Atlas, showing spatial expression of Itih3 in the pallidum.
Fig. 4. Integrative analysis of gene regulatory programs in different cerebral cell types.
a, Schematic overview of the computational strategy used to identify cCREs that are positively correlated with transcription of target genes. b, In total, 129,404 pairs of positively correlated cCRE and genes (highlighted in red) were identified (FDR < 0.01). Grey filled curve shows distribution of Pearson’s correlation coefficient (PCC) for randomly shuffled cCRE–gene pairs. c, Heat map showing chromatin accessibility of putative enhancers (left) and expression of linked genes (right). Genes are shown for each putative enhancer separately. UMI, unique molecular identifier. d, Enrichment of known transcription factor (TF) motifs in distinct enhancer–gene modules. Known motifs from HOMER with enrichment P value < 10^−10 are shown. e, UMAP embedding of cell types involved in adult neurogenesis at the SGZ (top) and SVZ (bottom). f, Predicted transcription factors in different cell types involved in neurogenesis in the SGZ and SVZ. g, UMAP embedding of NIPCs and radial glia-like cells coloured by cell type (top) and brain region (bottom). h, Heat map showing the differential cCREs of NIPCs between the SGZ and SVZ. i, Genome browser tracks showing representative differential cCREs of NIPCs in the SGZ (top) and SVZ (bottom). j, Representative images of transgenic mouse embryos showing LacZ reporter gene expression under the control of the indicated enhancers that overlapped the differential cCRE in i (dotted line). Images were downloaded from the VISTA database (https://enhancer.lbl.gov).
Fig. 5. Human orthologues of cerebral cCREs are enriched for noncoding risk variants for neurological diseases and traits in a cell-type-restricted manner.
Heat map showing the results of linkage disequilibrium score regression analysis of the noncoding variants associated with the indicated traits or diseases in the human orthologues of cCREs identified from 43 subclasses of mouse cerebral cells. *FDR < 0.05, **FDR < 0.01, ***FDR < 0.001. Displayed are all the 43 subclasses and traits or diseases with at least one significant association by linkage disequilibrium score regression analysis (FDR < 0.05).
Extended Data Fig. 1. Anatomic maps of the 45 dissections in the adult mouse cerebrum.
a, Schematic of brain tissue dissection strategy. Mouse brains were cut into 600-μm-thick coronal slices. b, Brain regions dissected from each coronal slice are marked according to the Allen Brain Reference Atlas. The frontal view of each slice from slices 1–11 is shown, with the dissected regions alphabetically labelled on the left, and the anatomic labelling listed on the right. A detailed list of the dissected regions and the full anatomic labelling can be found in Supplementary Table 1.
Extended Data Fig. 2. Quality control metrics of the snATAC-seq datasets.
a, Box plots showing the distribution of mapping ratios (the fraction of the mapped sequencing reads) in replicates (rep) 1 and 2 of snATAC-seq experiments from each brain dissection. b, Box plots showing the distribution of the number of proper read pairs (reads are correctly oriented) in replicates 1 and 2 of snATAC-seq experiments. c, Box plots showing the distribution of numbers of unique chromatin fragments detected in replicates 1 and 2 of snATAC-seq experiments. d, Box plots showing the distribution of number of unique barcodes captured in replicates 1 and 2 of snATAC-seq experiments. e, Frequency distribution plot showing the fragment size distribution of each snATAC-seq dataset. f, Heat map showing the pairwise Spearman correlation coefficients between snATAC-seq datasets. The column and row names consist of two parts: brain region name and replicate label. g, Dot plot illustrating fragments per nucleus and individual TSS enrichment. Nuclei in top right quadrant were selected for analysis (TSS enrichment > 10 and > 1,000 fragments per nucleus). h, Fraction of cell collision estimated from species mix samples. Inset shows the fraction of potential barcode collisions detected in snATAC-seq libraries using a modified version of Scrublet. i, Number of nuclei retained after each step of quality control. j, Distribution of TSS enrichment. k, The number of uniquely mapped fragments per nucleus for individual libraries. l, The number of nuclei passing quality control for subregions. All box plots are stylized as in Fig. 2c.
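The selection rule in panel g (TSS enrichment > 10 and > 1,000 fragments per nucleus) can be illustrated with a minimal Python sketch. The function names, the record layout and the example values are hypothetical and are not taken from the authors' pipeline; only the two thresholds come from the caption.

```python
def passes_qc(tss_enrichment, n_fragments, min_tss=10.0, min_fragments=1000):
    """Return True when a nucleus exceeds both thresholds from panel g.

    Both comparisons are strict, matching the '>' signs in the caption.
    """
    return tss_enrichment > min_tss and n_fragments > min_fragments


def filter_nuclei(nuclei):
    """Keep only the records (hypothetical dict layout) that pass both cut-offs."""
    return [n for n in nuclei if passes_qc(n["tss_enrichment"], n["n_fragments"])]


# Made-up example records, for illustration only.
example = [
    {"barcode": "nucleus_a", "tss_enrichment": 14.2, "n_fragments": 5300},
    {"barcode": "nucleus_b", "tss_enrichment": 8.9, "n_fragments": 7200},
    {"barcode": "nucleus_c", "tss_enrichment": 12.5, "n_fragments": 640},
]
kept = [n["barcode"] for n in filter_nuclei(example)]
```

Here only `nucleus_a` lies in the "top right quadrant" of the dot plot, because the other two records each fail one of the two thresholds.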
Extended Data Fig. 3. Cell clustering based on snATAC-seq data.
a, Schematic diagram of the cell clustering pipeline. b, UMAP embedding of a representative cell clustering at different resolutions from 0.1 to 1.0 using Leiden algorithm. c, Consensus matrix from 300 iterative clustering runs with different resolutions. d, Cluster stability at different resolutions was assessed using a CDF of consensus matrices. High values illustrate nuclei that clustered together in most cases. e, The PAC and dispersion coefficient at different resolutions. A low PAC and high dispersion coefficient indicate the best and most stable clustering. f, Optimal cell clustering result for a representative subclass (resolution = 0.5). g, The CDF curve of consensus matrix at an optimal resolution for every subclass.
Extended Data Fig. 4. Summary statistics of snATAC-seq datasets in the current study.
a, Violin plots showing the log-transformed number of unique fragments per nucleus in each cell subclass identified. b, Violin plots showing the TSS enrichment in each nucleus of each subclass. c, Acceptance rate from k-nearest neighbour batch effect test (kBET) for each subclass of cerebral cells. d, Distribution of the local inverse Simpson’s index (LISI) scores for cells in each subclass.
Extended Data Fig. 5. Reproducibility of the cell type composition of each brain region estimated from single-cell chromatin accessibility profiles.
a, Bar plot showing the fraction of nuclei from two biological replicates for each of the 43 subclasses of mouse cerebral cells discerned from snATAC-seq data. b, CDF plot showing the consistency of the estimated fractions of each subclass of cerebral cells between the two biological replicates. Kolmogorov–Smirnov test shows no significant difference between the biological replicates. c, Box plots of the P values of Kolmogorov–Smirnov tests illustrate consistent results between the two biological replicates for each subclass of cerebral cells across major brain regions, sub-regions and brain dissections tested. d, Heat map showing the pairwise Spearman correlation coefficients of cell type composition between each replicate of brain dissections. The column and row names consist of two parts: brain region name and replicate label. For example, MOp-1.1 represents replicate 1 of the first brain dissection of the primary motor cortex (MOp-1). The embedded box plot shows the distribution of Spearman correlation coefficients between two biological replicates, replicates from intra-major brain regions and inter-major brain regions. ***P < 0.001, Wilcoxon rank-sum test. Box plots are stylized as in Fig. 2c.
Extended Data Fig. 6. Marker genes used for annotation of different subclasses of cerebral cells.
a, Gene activity scores for marker genes used for subclass annotation. The UMAP embedding from Fig. 1b is shown for reference. b, Heat map showing the gene activity scores of the marker genes (columns) used for subclass annotation across the 43 subclasses (rows). The colour gradient bar is at the top right.
Extended Data Fig. 7. Iterative clustering identifies cell types for the subclasses of cerebral cells.
a, UMAP embedding of the three classes of cerebral cells, namely non-neurons (left), GABAergic neurons (middle), and glutamatergic neurons (right). b–d, The subclasses of cerebral cells that can be further divided into cell types are shown for the non-neurons (b), GABAergic neurons (c) and glutamatergic neurons (d). UMAP embedding is shown for each subclass, with the subclass label shown on the top left, and cell-type labels or cluster numbers on each cell type.
Extended Data Fig. 8. Comparison of cell type compositions in the mouse primary motor cortex determined by snATAC-seq using combinatorial indexing and droplet-based snATAC-seq (10x Genomics) platforms.
a, Individual clustering of snATAC-seq data generated using the single-cell combinatorial indexing (sci) and the droplet-based barcoding (10x) for the primary motor cortex (dissection: 3C). b, c, Co-embedding and joint clustering of snATAC-seq data generated from sci and 10x platforms. b, Dots are coloured by cell clusters. c, Dots are coloured by the experimental platforms (sci or 10x). d, Heat map illustrating the overlap between cell cluster annotations from both platforms. Rows show cell types from combinatorial barcoding; columns show cell types from the droplet-based platform. The overlap between the original clusters and the joint cluster was calculated (overlap score) and plotted on the heat map. e, CDF plot showing the fraction of nuclei in individual cell types for each platform.
Extended Data Fig. 9. Hierarchical tree depicting the relationships between different subclasses of cerebral cells.
Dendrogram tree was constructed using 1,000 rounds of bootstrapping for the subclasses using R package pvclust. Nodes are labelled in grey, approximately unbiased P values (in red) and bootstrap probability values (in green) are labelled at the shoulder of the nodes, respectively. A full list and a description of cell cluster labels are in Supplementary Table 3.
Extended Data Fig. 10. The chromatin-accessibility-based cell clustering matches transcriptomics-based cell taxonomy.
a, Heat map showing the similarity between A-type (accessibility) and T-type (transcriptomics) based cell cluster annotations. Each row represents an A-type cluster (a total of 160) and each column represents a T-type cluster (a total of 100). Similarity between original clusters and the joint cluster was calculated as the overlap score, which is defined as the sum of minimal proportion of cells/nuclei in each cluster that overlapped within each co-embedding cluster. The overlap score varied from 0 to 1 and was plotted on the heat map. Joint clusters with an overlap score of >0.5 are highlighted using a black dashed line and labelled with RNA–ATAC joint cluster ID. For a full list of cell type labels and descriptions see Supplementary Table 4. b, c, Bar plots indicating the number of clusters that matched (dark grey) or did not match (light grey) clusters from the other modality. b, 155 out of 160 A-types had a matching T-type cluster. c, 84 out of 100 T-types had a matching A-type cluster.
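One reading of the overlap score defined in panel a is sketched below in Python. For an A-type cluster and a T-type cluster, the sketch takes the proportion of each cluster's cells that falls into every co-embedding (joint) cluster and sums the smaller of the two proportions over the joint clusters, which yields a value between 0 and 1. The function names and the toy labels are hypothetical, and the exact computation used in the study may differ.

```python
from collections import Counter


def cluster_proportions(original_labels, joint_labels):
    """For each original cluster, the fraction of its cells in each joint cluster."""
    counts = {}
    for orig, joint in zip(original_labels, joint_labels):
        counts.setdefault(orig, Counter())[joint] += 1
    return {
        orig: {joint: n / sum(c.values()) for joint, n in c.items()}
        for orig, c in counts.items()
    }


def overlap_score(props_a, props_t):
    """Sum over joint clusters of the smaller of the two proportions."""
    joint_ids = set(props_a) | set(props_t)
    return sum(min(props_a.get(j, 0.0), props_t.get(j, 0.0)) for j in joint_ids)


# Toy example: four nuclei of one A-type and two cells of one T-type,
# each assigned to joint cluster 0 or 1 after co-embedding.
atac = cluster_proportions(["A1", "A1", "A1", "A1"], [0, 0, 0, 1])
rna = cluster_proportions(["T1", "T1"], [0, 1])
score = overlap_score(atac["A1"], rna["T1"])  # min(0.75, 0.5) + min(0.25, 0.5)
```

In this toy case the score is 0.75, which would exceed the 0.5 threshold used in the caption to highlight matching joint clusters.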
Extended Data Fig. 11. Cellular composition of brain regions, subregions and dissections.
a, Cluster dendrogram based on chromatin accessibility as in EDF8. b–d, Normalized percentages (pct) of each subclass in the four major regions (b), the subregions (c), and the dissected regions (d) are shown as different sized dots. The sizes of dots correspond to the percentage, and the colours of the dots indicate the major brain regions, subregions or dissections. Bar plots to the right show the total number of nuclei sampled for each region, subregion, or dissection.
Extended Data Fig. 12. Distribution of each cerebral cell type across different brain regions, subregions and anatomical dissections.
a–c, Normalized percentages of cells from each major brain region (a), subregion (b) and dissection (c) are shown as dots in cell types, with the sizes of the dots reflecting the percentages. d, Scatter plot showing the reproducibility of regional specificity of each cell type between two biological replicates of brain dissections.
Extended Data Fig. 13. The regional specificity of cerebral cell types defined on the basis of snATAC-seq datasets is consistent with ISH patterns of cell-type-specific genes in the mouse brain.
a, Bar plots showing regional specificity of each cell type determined from relative contribution from five different brain regions, including isocortex, olfactory areas, hippocampal formation, striatum and pallidum. Dot plots showing percentages of each major brain region in the subclass of cerebral cells. The size of each dot reflects the relative contribution of the brain region as indicated by the legend to the right of the panel, and the colour of the dots indicates the brain region. b, Bar plots showing the coefficients of variation calculated with the average normalized ISH signals of cell-type-specific marker genes for the corresponding cell subclasses across five brain regions. The heat map shows the average normalized ISH signals of cell subclass-specific marker genes. The cell subclass-specific marker genes were identified for joint cell subclasses from RNA-ATAC integrative analysis (Extended Data Fig. 9). For a full list of ISH data of cell subclass-specific genes, see Supplementary Table 6. c, Scatter plot shows the correlation between the coefficients of variation of marker gene ISH signals and the regional specificity score calculated based on snATAC-seq data for 32 joint cell subclasses from RNA-ATAC integration analysis (Pearson correlation coefficients (PCC) = 0.55). d, Box plots show the Pearson correlation coefficients (PCC) calculated between cell composition across brain regions based on snATAC-seq data and the spatial distribution ISH signals of cell subclass-specific genes across the five main brain regions for each major brain cell subclass. Box plots are stylized as in Fig. 2c.
Extended Data Fig. 14. Brain region specificity of different subclasses and cell-types.
a, UMAP embedding of the subclasses of GABAergic neurons. b, Sub-region composition of dopaminergic neurons, D1MSN and D2MSN, and MXD illustrates that most MXD neurons were detected in the pallidum. c, Gene activity score for gene Isl1 in GABAergic neurons predicts expression in MXD. d, RNA ISH and bar plot illustration of expression levels show the highest abundance of Isl1 in the predicted region, pallidum (blue). Data and images were downloaded from © 2004 Allen Institute for Brain Science. Allen Mouse Brain Atlas. Available from: atlas.brain-map.org. e, UMAP embedding of PVGA cell types. f, Sub-region composition for each PVGA cell type illustrates that the majority of PVGA-7 were detected in nucleus accumbens and caudoputamen. g, Gene activity scores for Kit and Pde3a predict expression in one PVGA cell type (PVGA-7). h, RNA ISH and bar plot illustration of expression levels show expression of Kit and Pde3a in predicted regions, nucleus accumbens and caudoputamen. Predicted region in blue, other sampled regions in grey and non-sampled regions in white. i, UMAP embedding of CA1 glutamatergic neurons (CA1GL) cell types. j, Dissection composition in each CA1GL cell type. k, Gene activity of gene Dcn in CA1GL cell types. l, Density of expression level of Dcn viewed in BrainExplorer (https://mouse.brain-map.org/static/brainexplorer) shows expression in cornu ammonis field 1 (CA1). m, RNA ISH and bar plot show highest expression of Dcn in CA1 and hippocampal formation. The predicted region is coloured in blue, other sampled regions in grey and non-sampled regions in white.
Extended Data Fig. 15. Statistics of peak calling from snATAC-seq data from each cell type.
a, Schematic diagram of peak calling and filtering pipeline. b, Number of peaks retained after each peak calling and filtering step. c, Scatter plots showing the relationship between the number of nuclei in each cluster and the number of peaks before score-per-million (SPM) correction (left) and after SPM correction (right). d, Density distribution plot showing the fraction of cells per cell type in which a peak was accessible and a corresponding background for each cell type. For each cell type, the background is defined as the same number of non-DHS and non-peak regions randomly picked from the genome. e, Fraction of different peak sets that overlap with annotated transcriptional start sites, introns, exons, transcriptional termination sites (TTS), and intergenic regions in the mouse genome. f, Fraction of different peak sets that overlap with DHSs. g, Box plot showing sequence conservation in different peak sets and the control set. h, Enrichment analysis of different peak sets with a 15-state ChromHMM model in the mouse brain chromatin. All box plots are stylized as in Fig. 2c.
Extended Data Fig. 16. Differentially accessible cCREs across different cell types and brain regions.
a, Distribution plot of the Jensen–Shannon specificity scores of the differentially accessible cCREs and invariable cCREs. The vertical line shows the cutoff at FDR of 0.05. b, Stacked bar charts showing the number of differential cCREs between cell types for subclasses. c, A graph shows the numbers of differential cCREs in each major brain region. d, A graph shows the numbers of differential cCREs across subregions. The sizes of dots correspond to the number of differential cCREs (log10-transformed) found in each cell type between brain regions, and the colours of the dots indicate the major brain regions in c, subregions in d.
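The caption does not state how the Jensen–Shannon specificity score in panel a is computed. The Python sketch below shows one widely used formulation, given here as an assumption rather than as the study's method: the accessibility of a cCRE across cell types is normalized to a probability vector, compared with an idealized profile that is accessible in a single cell type, and scored as one minus the square root of the Jensen–Shannon divergence (base-2 logarithm). All function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two probability vectors (0 to 1 in bits)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def js_specificity(accessibility, cell_type_index):
    """Specificity of one element for one cell type (assumed formulation).

    Returns 1.0 when the element is accessible only in the given cell type,
    and lower values as the signal spreads over other cell types.
    """
    total = sum(accessibility)
    if total == 0:
        return 0.0
    p = [x / total for x in accessibility]
    ideal = [0.0] * len(p)
    ideal[cell_type_index] = 1.0
    # max() guards against tiny negative values from floating-point rounding.
    return 1.0 - math.sqrt(max(js_divergence(p, ideal), 0.0))


specific = js_specificity([5.0, 0.0, 0.0, 0.0], 0)  # restricted to one cell type
uniform = js_specificity([1.0, 1.0, 1.0, 1.0], 0)   # equally accessible everywhere
```

Under this formulation an element restricted to one cell type scores 1, while an invariable element scores much lower, which is the contrast the distribution plot in panel a draws between differentially accessible and invariable cCREs.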
Extended Data Fig. 17. Variation of chromatin accessibility across different cell types within the subclasses of cerebral neurons.
a, UMAP embedding and gene activity scores of Chodl show specificity for one SSTGA cell type. b, Bar chart showing the number of differentially accessible cCREs shared by 1, 2, 3 or 4 cell types of SSTGA. c, Bar chart showing the number of cell-type-specific cCREs, categorized by proximal (black) and distal regions (light grey). d, UMAP embedding of MSGA cell types. e, Bar chart showing the number of differentially accessible cCREs shared by 1, 2, 3, 4 and 5 subtypes of MSGA. f, Number of specific accessible proximal and distal cCREs for each MSGA cell type. g, Heat map showing the normalized accessibility of the cell-type-specific cCREs across different MSGA cell types. h, Heat map showing the promoter accessibility at selected marker genes in each MSGA cell type. i, ISH data (https://portal.brain-map.org) of the marker genes of each MSGA cell type. j, Motif enrichment of uniquely accessible cCREs and expression level of transcription factors in cholinergic neurons (MSGA10). The RNA expression of cholinergic neurons was downloaded from mouse brain atlas (http://mousebrain.org).
Extended Data Fig. 18. Astrocyte cell types exhibit regional specificity and differential chromatin accessibility.
a, UMAP embedding of astrocytes, coloured by cell type (left), fragment depth (middle), and replicate (right). b, Heat map illustrating the overlap between A-type and T-type cell cluster annotations. Each row represents a snATAC-seq subtype and each column represents a scRNA-seq cluster (http://mousebrain.org). The overlap between original clusters and the joint cluster was calculated (overlap score) and plotted on the heat map. c, Heat map showing overlap score with cell type defined by integration of multiple transcriptomic and epigenomic modalities. d, Smoothed gene activity scores of marker genes for astrocytes from white and grey matter. e, Smoothed gene activity scores of representative cortical layer-specific marker genes for astrocytes. f, Upset plot showing intersections of differentially accessible cCREs between cell types. cCREs present in only one cell type are defined as cell-type-specific cCREs. g, Gene activity score for GLI transcription factor family members in astrocyte cell types. h, Smoothed gene activity scores of Olig2, Itih3, Agt and Slc6a11 in astrocytes show stronger activity in ASCN. i, Genome browser tracks of aggregate chromatin accessibility profiles for each astrocyte cell type at the Itih3, Slc6a11 and Agt loci. j, Views of ISH experiments from Allen Brain Atlas (atlas.brain-map.org) showing predominant expression of Slc6a11 and Agt in the pallidum. Bar plot showing expression values from ISH experiments in brain structures. Data and images were downloaded from © 2004 Allen Institute for Brain Science. Allen Mouse Brain Atlas. Available from: atlas.brain-map.org. k, UMAP embedding of astrocytes coloured by brain subregions. l, Motif enrichment in cCREs in ASCG with regional specificity.
Extended Data Fig. 19. Characterization of predicted cCRE–target gene pairs.
a, Histogram illustrating the 1-D genomic distances between positively correlated distal cCRE and putative target gene promoters. b, Box plot showing that genes were linked with a median of 7 putative enhancers. Box plot is stylized as in Fig. 2c. c, Accessibility at promoter regions across RNA–ATAC joint cell types; order is as in Fig. 4c. d, Enrichment of sequence motifs for CTCF, MEF2 and RFX from de novo motif search in the putative enhancers of module M1 using HOMER. For a full list, see Supplementary Table 21. e, Venn diagram illustrating the overlap of putative target genes of cCREs containing binding sites for MEF2, RFX and CTCF, respectively. f, Gene Ontology (GO) analysis of the putative target genes of each factor in module M1 was performed using Enrichr. The combined score is the product of the computed P value using the Fisher exact test and the z-score of the deviation from the expected rank. g, h, Examples of distal cCREs overlapping an RFX motif (g) or a MEF2 motif (h) and positively correlated putative target genes. Motifs were identified using de novo motif search in HOMER. Genome browser tracks showing chromatin accessibility, mCG methylation levels (see accompanying manuscript) and positively correlated cCRE and gene pairs.
Extended Data Fig. 20. CTCF-associated cCRE–gene pairs.
a, Enrichment of CTCF binding sites from 8-week-old mouse forebrain in cCRE modules. b, t-SNE of snm3C-seq cells (n = 5,142) coloured by clusters. This panel was replicated from the accompanying paper for illustration purposes. c, Percentage of cCREs from module M1 with and without CTCF binding overlapping with contact loops identified from snm3C-seq in hippocampus. Shuffled loops are generated by randomly flipping one of the anchors. d, Genome browser track view at the Nsg2 locus as an example for CTCF-dependent loops. Displayed are chromatin accessibility profiles for several neuronal and non-neuronal clusters, positively correlated cCRE–gene pairs, H3K27ac in postnatal day 0 (P0) mouse forebrain and CTCF in cortex from 8-week-old mouse. The triangle heat map (top) shows chromatin contacts in DG neurons and MGC derived from snm3C-seq data (see accompanying manuscript). DG-specific loops are highlighted by boxes.
Extended Data Fig. 21. Predicted transcription factors involved in adult neurogenesis lineages.
a, Normalized accessibility at marker genes in NIPCs. b, RNA ISH shows expression levels of marker genes of NIPCs in the SGZ and SVZ. Images were downloaded from © 2004 Allen Institute for Brain Science. Allen Mouse Brain Atlas. Available from: atlas.brain-map.org. c, Genome browser tracks showing representative differentially accessible cCREs in NIPCs from SVZ.
Extended Data Fig. 22. Mouse cerebral cCRE maps help to interpret potential causal risk variants of neurological diseases.
a, Bar plot showing the percentage of cCREs identified in the current study with homologous sequences in the human genome (using reciprocal homology search) (Methods). b, Chromatin accessibility profiles for several neuronal and non-neuronal cell types, and posterior probabilities of association (PPA) for potential causal variants surrounding the homologous region in the mouse genome to a schizophrenia-associated locus. Grey bars highlight cCREs overlapping potential causal variants. rsID or hg19 coordinates of overlapped variants are labelled. ‘:D’ denotes that the alternative allele is a deletion; ‘:I’ denotes an insertion. Predicted positively correlated cCRE–gene pairs are shown in red arcs.
Extended Data Fig. 23. Nuclei gating strategy.
After tagmentation, nuclei were pooled and stained with DRAQ7. First, potential nuclei were identified using forward scatter (FSC) area and back scatter (BSC) area (left dot plot). Next, potential doublets were removed based on BSC and FSC signal width (two middle dot plots). Finally, 20 diploid nuclei (2n) were sorted into each well of eight 96-well plates (right dot plot).

References

    1. Paxinos, G. The Rat Nervous System 4th edn (2015).
    2. Zeisel A, et al. Molecular architecture of the mouse nervous system. Cell. 2018;174:999–1014. doi: 10.1016/j.cell.2018.06.021.
    3. Saunders A, et al. Molecular diversity and specializations among the cells of the adult mouse brain. Cell. 2018;174:1015–1030. doi: 10.1016/j.cell.2018.07.028.
    4. Tasic B, et al. Shared and distinct transcriptomic cell types across neocortical areas. Nature. 2018;563:72–78. doi: 10.1038/s41586-018-0654-5.
    5. Hodge RD, et al. Conserved cell types with divergent features in human versus mouse cortex. Nature. 2019;573:61–68. doi: 10.1038/s41586-019-1506-7.
