Comparative Study
Nature. 2018 Nov;563(7729):72-78.
doi: 10.1038/s41586-018-0654-5. Epub 2018 Oct 31.

Shared and distinct transcriptomic cell types across neocortical areas

Bosiljka Tasic et al. Nature. 2018 Nov.

Abstract

The neocortex contains a multitude of cell types that are segregated into layers and functionally distinct areas. To investigate the diversity of cell types across the mouse neocortex, here we analysed 23,822 cells from two areas at distant poles of the mouse neocortex: the primary visual cortex and the anterior lateral motor cortex. We define 133 transcriptomic cell types by deep, single-cell RNA sequencing. Nearly all types of GABA (γ-aminobutyric acid)-containing neurons are shared across both areas, whereas most types of glutamatergic neurons were found in one of the two areas. By combining single-cell RNA sequencing and retrograde labelling, we match transcriptomic types of glutamatergic neurons to their long-range projection specificity. Our study establishes a combined transcriptomic and projectional taxonomy of cortical cell types from functionally distinct areas of the adult mouse cortex.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Extended Data Fig. 1 | Overview of sample collection.
a, One of the two brain regions, ALM or VISp, was dissected from an adult mouse (P53-P59 (n = 339), P51 (n = 1), and P63-P91 (n = 12), Supplementary Table 1). b, Example microdissection images from the most heavily sampled mouse genotype, Snap25-IRES2-cre/wt;Ai14/wt, from both cortical regions. For many samples, microdissection was used to isolate layer-enriched portions of the cortex. ALM lacks L4. The dissection images are representative of n = 21 processed Snap25-IRES2-cre/wt;Ai14/wt brains. c, Microdissected layers were processed separately for single-cell suspension. Each sample was digested with pronase, then triturated with pipettes of decreasing tip diameter (600, 300 and 150 μm). Individual cells were sorted into 8-well strip PCR tubes by FACS. d, SMART-Seq v4 was used to reverse-transcribe and amplify full-length cDNAs from each cell. cDNAs were then tagmented by Nextera XT, PCR-amplified, and sequenced on Illumina HiSeq2500. e, Common gates used for all FACS sorts: (1) Morphology gate excludes events with high side scatter and low forward scatter, which are largely cellular debris, and (2) SC-FSC and SC-SSC gates exclude samples with high forward scatter width and high side scatter width, respectively, to exclude cell doublets and multiplets. f, Example gating for live tdTomato+ or tdTomato− cells. Cells sorted using the tdTom+ gate express the tdTomato reporter and have low DAPI fluorescence. This plot was generated from cell suspension isolated from a Snap25-IRES2-cre/wt;Ai14/wt animal, which expresses tdTomato in all neurons. The tdTom− gate in this genotype was the main source of non-neuronal cells, which have low DAPI fluorescence and low tdTomato expression. Gating hierarchy and sorting statistics are shown above the FACS scatter plot. This gating strategy is representative of n = 21 processed Snap25-IRES2-cre/wt;Ai14/wt brains. 
g, To sort eGFP+ cells, we used the same debris and doublet gating described in e, then collected cells with high eGFP and low DAPI fluorescence (eGFP+ gate). This plot was generated from cell suspension isolated from VISp of a Ctgf-2A-dgcre/wt;Snap25-LSL-F2A-GFP/wt animal, which expresses eGFP in L6b neurons. Gating hierarchy and sorting statistics are shown above the FACS scatter plot. This gating strategy is representative of n = 4 processed Ctgf-2A-dgcre/wt;Snap25-LSL-F2A-GFP/wt brains. h, Genotype and layer sampling frequencies for all cluster-assigned cells (n = 23,822). PAN Cre-lines were used to broadly sample neurons in the cortex. Non-PAN lines were included to (1) enrich for cells that displayed poor survival in the isolation process (for example, Rbp4-cre_KL100 for L5 types and Pvalb-IRES-cre for Pvalb types); (2) enrich for rare cell types (for example, Ctgf-2A-dgcre for L6b); and (3) transcriptomically characterize cell types labelled by these lines. Bar plots show the number of cells sampled from each genotype and region (A, ALM; V, VISp). Bars are coloured according to the number of samples from microdissected layers or combinations of layers. i, Transgenic driver composition with respect to cell types for all cluster-assigned cells (n = 23,822). The stacked bar plot shows the proportion of cells in each cluster that were collected from each Cre line. Black bars represent cells collected from retrograde tracing experiments. These cells were labelled by a fluorophore-expressing virus or by a Cre-expressing virus together with a Cre-reporter transgenic line. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Extended Data Fig. 2 | scRNA-seq pipeline and analysis workflow.
a, Workflow diagram outlines the path from individual experimental animals to quality control-qualified scRNA-seq data. At multiple points throughout sample processing, cell and sample metadata were recorded in a laboratory information management system (LIMS, labelled as L), which informs quality control processes. Samples must pass quality control benchmarks to continue through sample processing. b, Clustering procedure. Cells were first divided into broad cell classes based on known marker gene expression, then were segregated into clusters 100 times using a bootstrapped procedure that sampled 80% of cells each time. Within each iteration, cells were split by selection of high variance genes followed by PCA or WGCNA dimensionality reduction. Principal components and WGCNA eigengenes were then used to cluster samples by hierarchical clustering or graph-based Jaccard-Louvain clustering algorithm, depending on the number of cells (clustering module, red box). Clusters were checked for over-splitting or termination criteria (merging module, purple box). Each cluster was used as input for a further round of splitting until termination criteria were met. After 100 rounds of clustering, the frequencies with which samples were clustered together were used as a similarity measure to hierarchically cluster the samples. The resulting hierarchical clustering tree was then dynamically cut, and the resulting clusters were checked for over-splitting. Finally, cells were subjected to validation by a centroid classifier. After 100 rounds of validation, cells that were mapped to the same cluster in more than 90 out of 100 trials were assigned ‘core’ cell identity (n = 21,195), and cells with lower scores were assigned ‘intermediate’ cell identity (n = 2,627). Most intermediate cells were mapped to only two clusters (2,492 out of 2,627; 94.9%). c, The number of cells at each step in our analysis pipeline. 
The identification of doublets and low-quality clusters is described in the Methods. Some high-quality cells (n = 452) from the retro-seq dataset were not used for projection analysis because the stereotaxic injections were judged unsatisfactory after brain sectioning, for one of three reasons: (1) the incorrect target was injected, (2) the injection was too close to the collection site, or (3) strong injection tract labelling was detected (Methods). These cells were kept in transcriptomic clusters, but were not used to inform the specificity of glutamatergic projections. Only cells from the annotated retro-seq dataset (n = 2,204) were used for connectivity analyses in Fig. 3 and Extended Data Fig. 10a, b.
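Two steps of the clustering procedure in b lend themselves to a short illustration: the co-clustering frequency used as a similarity measure, and the core versus intermediate rule. The following is a minimal sketch, not the authors' pipeline. The function names and the data layout (one dict of cell index to cluster label per bootstrap run, one list of labels per cell for validation) are hypothetical, and normalizing by the number of runs in which both cells were sampled is an assumption.

```python
from collections import Counter
from itertools import combinations


def co_clustering_frequency(runs):
    """Fraction of runs in which two cells share a cluster.

    runs: list of dicts {cell_index: cluster_label}, one per bootstrap run,
    each covering only the cells sampled in that run (hypothetical layout).
    Assumption: a pair is scored only over runs in which both cells appear.
    """
    together = Counter()
    sampled = Counter()
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[(i, j)] += 1
            if labels[i] == labels[j]:
                together[(i, j)] += 1
    return {pair: together[pair] / n for pair, n in sampled.items()}


def assign_identity(trial_labels, threshold=90):
    """Apply the caption's rule: 'core' if one cluster wins more than
    `threshold` of the validation trials (100 trials in the caption),
    otherwise 'intermediate'. Also returns how many clusters were hit."""
    counts = Counter(trial_labels)
    best_cluster, best_count = counts.most_common(1)[0]
    identity = "core" if best_count > threshold else "intermediate"
    return best_cluster, identity, len(counts)
```

With toy inputs, a cell mapped to one cluster in 95 of 100 trials is labelled core, whereas a cell mapped there in exactly 90 trials is labelled intermediate, because the rule requires more than 90.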
Extended Data Fig. 3 | Co-clustering frequency matrix, confusion scores and intermediate cells.
a, The co-clustering frequency matrix (centre) for up to 100 cells per cluster selected at random (n = 10,820). Some cell types, for example certain Pvalb types (middle of enlarged panel), display pronounced co-clustering. t-SNE was used to visualize the similarity of gene expression patterns in two dimensions for all cluster-assigned cells (n = 23,822). Individual cells in t-SNE plots were coloured by: cell class (GABAergic, red; glutamatergic, blue; glia, grey; endothelial cells, brown), animal donor sex (female, pink; male, purple), dissected brain region (ALM, black; VISp, grey), confusion score (low, blue; high, red), and the number of genes detected (low, blue; high, red). b, Pairwise correlation, differential gene expression and co-clustering for all 133 clusters using all cluster-assigned cells (n = 23,822). c, Confusion scores for all cluster-assigned cells (n = 23,822) segregated by clusters. For each cell, the confusion score is defined as the ratio of the probabilities for that cell to be clustered with the cells from its second best cluster and with the cells from the final cluster (also the best cluster except for rare exceptions). Thus, confusion score is a measure of the confidence of cell type assignment: the lower the value, the less frequently a cell was grouped with cells from a different cluster. Each blue dot is a confusion score for a single cell, median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. d, Fraction of cluster-assigned cells (n = 23,822) annotated as core (coloured) or intermediate (black) for each cluster. In total, 21,195 cells (88.97%) were assigned core, whereas 2,627 (11.03%) were assigned intermediate identity. e, f, We performed 100 rounds of bootstrapped clustering to determine the confidence of our hierarchical clustering structure (Methods). 
The final dendrogram generated by this method (e), with branches coloured by their bootstrapped confidence: light grey (low confidence), maroon (moderate confidence), and black (high confidence). For figures, we used the dendrogram in f, in which we collapsed branches with confidence lower than 0.4.
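The confusion score defined in c can be written as a one-line ratio. The sketch below is illustrative only: the function name and the input layout (a dict of per-cluster co-clustering probabilities for one cell) are hypothetical. It takes the runner-up to be the highest-probability cluster other than the final one, which matches the caption's definition whenever the final cluster is also the best cluster; the rare exceptions mentioned in the caption are not modelled.

```python
def confusion_score(cluster_probs, final_cluster):
    """Ratio of the runner-up cluster's probability to the final cluster's.

    cluster_probs: {cluster_label: probability that this cell is clustered
    with cells of that cluster} (hypothetical layout).
    Assumption: the runner-up is the best cluster other than the final one.
    """
    final_p = cluster_probs[final_cluster]
    if final_p <= 0:
        raise ValueError("final cluster must have a positive probability")
    others = [p for c, p in cluster_probs.items() if c != final_cluster]
    runner_up = max(others, default=0.0)
    return runner_up / final_p
```

A cell that always groups with its final cluster scores 0, and the score approaches 1 as the runner-up cluster becomes as likely as the final one.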
Extended Data Fig. 4 | Sequencing depth and gene detection for quality-control-qualified cells segregated by cluster.
a, Sequencing depth for all cluster-assigned cells (n = 23,822), grouped by cell type. Cells were sequenced to a median depth of 2.54 million reads (min = 0.103 M; max = 13.84 M). Median values in millions of reads are adjacent to the cell type labels. Median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. b, The number of detected genes (reads detected in exons >0) varies by cell type. Gene detection is shown for each cluster-assigned cell (n = 23,822). Median values are shown adjacent to the cell type labels. Median number of genes detected across all cells is 9,462 per cell (min = 1,445; max = 15,338). Median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. Samples with fewer than 1,000 detected genes were excluded at a prior quality control step (Extended Data Fig. 2). c, Comparison of gene detection between our previous study and this study for all cluster-assigned cells (left) and cells grouped according to major classes (right). d, Comparison of gene detection between cell classes for core and intermediate cells. The higher gene detection for intermediate non-neuronal cells may be due to contamination with other non-neuronal cells. e, Comparison of gene detection within cell classes between VISp and ALM. In c-e, medians are shown as dots and values are listed below each distribution; whiskers are twenty-fifth and seventy-fifth percentiles. Sample size (number of cells) for each analysed group is listed between panels a and b, and below the graphs for panels c-e.
Extended Data Fig. 5 | Markers used for cell type assignment.
a, Marker panel of 88 genes for cell classes. For each cluster, 25% trimmed mean expression values are shown (n = 23,822 cluster-assigned cells; 133 clusters). Maximum expression values (counts per million reads, CPM) are shown to the right of the heat map. Pan-neuronal markers (for example, Snap25) were used to assign neuronal type identity. Glutamatergic (for example, Slc17a7 and Slc17a6) and pan-GABAergic (for example, Gad1, Gad2 and Slc32a1) markers were used to assign glutamatergic and GABAergic identity, respectively. Known non-neuronal markers were used to assign non-neuronal identities. b, Marker panel for glutamatergic cell types. For each cluster, 25% trimmed mean expression values are shown (n = 11,905 cells; 56 clusters). Layer-specific markers were used to assign layer identity (for example, Cux2, Rorb, Deptor and Foxp2). To assign final names to types, subclass and/or layer-markers were combined with unique or other specific markers, many of which are novel. Once the identity was assigned, previously unknown genetic bases of phenotypes could be discovered. For example, for the Cajal-Retzius cell type CR-Lhx5, which has been shown by immunohistochemistry to contain glutamate but not GABA, we show that its glutamatergic phenotype stems from the expression of mRNA encoding VGLUT2 (Slc17a6), and not from the other glutamate transporters (Slc17a7 and Slc17a8). Note that some markers that appear non-specific provide preferential labelling of specific types when used to make random-insertion transgenic BAC lines. For example, Efr3a-cre_NO108 specifically labels near-projecting types, although Efr3a mRNA is ubiquitously detected in all neurons. However, its expression is about fivefold higher in near-projecting types; this may contribute to preferential labelling of the near-projecting types by this BAC transgenic Cre line. c, Marker panel for GABAergic cell types. 
For each cluster, 25% trimmed mean expression values are shown (n = 10,534 cells; 61 clusters). GABAergic subclasses were assigned based on the expression of Lamp5, Serpinf1, Sncg, Vip, Sst and Pvalb. Final names were assigned based on unique combinations of markers.
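The 25% trimmed mean used in the heat maps can be sketched as follows. This is a minimal illustration with a hypothetical function name, and it assumes that 25% of the values are dropped from each end before averaging, which is one common convention (it is how R's `mean(x, trim = 0.25)` behaves). The caption does not state which convention was used.

```python
def trimmed_mean(values, proportion=0.25):
    """Mean after discarding `proportion` of the values from each tail.

    Assumption: trimming is symmetric and the number of values dropped
    per tail is rounded down.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("no values left after trimming")
    return sum(kept) / len(kept)
```

Trimming makes the per-cluster summary robust to a few cells with extreme expression: for the toy values `[0, 0, 10, 12, 14, 16, 1000, 5000]`, the two lowest and two highest values are dropped and the result is 13.0.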
Extended Data Fig. 6 | Comparison of cell types to those defined previously.
We compared the similarity of clustering results for our current dataset and our previous dataset by nearest centroid classification (Methods). We mapped the 1,424 cells from the previous dataset to 133 clusters from the current study (a) and vice versa: 23,822 current cells to 49 previous clusters. b, Some types were largely absent from our previous study. For example, the Meis2 type was not detected probably owing to rarity and white-matter confinement, whereas the L5 near-projecting types were missed owing to reliance on Rbp4-cre_KL100 to isolate L5 cells. We now find that, in contrast to pan-neuronal (Snap25-IRES-cre) or pan-glutamatergic Cre lines (Slc17a7-IRES-cre), Rbp4-cre_KL100 labels all L5 types except L5 near-projecting (Extended Data Fig. 8).
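Nearest centroid classification, as used for the mapping above, can be illustrated with a small sketch. The details here are assumptions rather than the authors' implementation: each reference cluster is summarized by the mean expression of its cells over a shared gene list, and a query cell is assigned to the centroid with the highest Pearson correlation. The function names, cluster labels and expression values are toy placeholders.

```python
from math import sqrt


def centroid(profiles):
    """Gene-wise mean of a list of equal-length expression profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def map_to_nearest_centroid(query, reference):
    """Assign `query` to the reference cluster with the best-correlated centroid.

    reference: {cluster_label: [expression profile of each cell]}
    (hypothetical layout; similarity by correlation is an assumption).
    """
    centroids = {name: centroid(cells) for name, cells in reference.items()}
    return max(centroids, key=lambda name: pearson(query, centroids[name]))
```

Running the mapping in both directions, as the caption describes, amounts to swapping which dataset supplies `reference` and which supplies the query cells.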
Extended Data Fig. 7 | Saturation of cell type detection.
We tested the ability of our computational approach to segregate cell types based on our standard cluster separation criteria (Methods) upon decreasing cell numbers included in clustering. The total number of sampled cells and derived clusters in each trial is shown at the top of each column. Each row is a cluster from our full dataset analysis. Each coloured box represents the number of cells from each cluster (row) that were included in the specific downsampled dataset (column). Clusters that merge are shifted to the left of their respective columns and merging is indicated by arcs and/or removal of lines between adjacent clusters. For most types, we were able to segregate them from their related types with many fewer cells than present in the complete dataset. Further cell sampling may reveal additional diversity, especially for rare cell types.
Extended Data Fig. 8 | Cell-type labelling by recombinase driver lines.
Driver line names are listed on top (columns, n = 55) and cell types on the left (rows, n = 133). Coloured discs represent the numbers of cells detected for each type (cell numbers are proportional to disc surface area). This plot is based on 20,758 cells isolated from transgenic mice that were collected as tdT+ or GFP+ by FACS. Note that the relative proportions of cell types obtained in these experiments are likely to be affected by cell type-specific differences in survival during the isolation procedure and by sampling via layer-enriching dissections.
Extended Data Fig. 9 | Non-neuronal cell types.
a, Non-neuronal cells (n = 1,383 cells) are divided into two major branches according to their developmental origin: neuroectoderm-derived branch, which contains astrocytes and oligodendrocytes (left), and non-neuroectoderm-derived, which includes immune cells (microglia, perivascular macrophages), blood vessel-associated cells (smooth muscle cells, pericytes and endothelial cells), and vascular leptomeningeal cells (VLMCs, right). All have been detected in both ALM and VISp, except VLMC-Osr1-Cd74, which may be rare (12 cells total) and may also be detected in ALM with further sampling. Violin plots represent distributions of individual marker gene expression in single cells within each cluster. Rows are genes, median values are black dots, and values within rows are normalized between 0 and the maximum expression value for each gene (right edge of each row) and displayed on a linear scale. We identify astrocytes based on expression of Aqp4. Oligodendrocyte lineage cells express Sox10. Oligodendrocyte precursor cells are marked by expression of Pdgfra and absence of Col1a1, with dividing oligodendrocyte precursor cells expressing Ccnb1. Newly generated oligodendrocytes (Oligo-Rassf10) express Enpp6, whereas myelinating oligodendrocytes (Oligo-Serpinb1a/Synpr) express Opalin. Two related types of immune cells coexpress Cd14 and Fcgr3, and can be identified as microglia by expression of Siglech and Tmem119, and perivascular macrophages by expression of Mrc1, Lyve1 and Cd163. We identify two related types of blood vessel-associated cells as pericytes and smooth muscle cells (SMCs) based on their expression of Cspg4 and Acta2 (reviewed previously). We assign SMC identity to SMC-Acta2, which strongly expresses Acta2 (smooth muscle actin). We assign pericyte identity to the Peri-Kcnj8 cluster based on specific expression of pericyte markers Kcnj8 and Abcc9. 
We define additional markers uniquely expressed in this cell type (Atp13a5, Art3, Pla1a and Ace2) that may help solidify pericyte identity in future studies. We identify one type of endothelial cells (Endo-Slc38a5) based on expression of previously characterized endothelial markers, Tek, Pdgfb, Nos3, Eltd1 and Pecam1. We identify VLMC types based on their unique expression of Lum and Col1a1. We define four types based on differential gene expression. Markers examined in b and c are highlighted by arrows and colours, respectively. b, RNA ISH for some of the non-neuronal markers from the Allen Brain Atlas. Images contain regions of interest from representative sections selected from individual whole-brain RNA ISH experiments. Spp1 mRNA is detected in the meninges and scattered in the cortex, corresponding to VLMCs, as well as select Pvalb and Sst types. Gja5 mRNA labels vessel-like structures in the grey matter, probably corresponding to the Endo-Slc38a5 cluster. Slc47a1 is specific to the VLMC-Osr1-Cd74 type, which appears to be restricted to pia. Dcn is expressed in three VLMC types, and its expression is seen in the pia and vessel-like structures in the cortex. The number of whole-brain experiments per gene available in the Allen Brain Atlas is as follows: Spp1: n = 4 brains (2 sagittal, 2 coronal); Gja5: n = 2 brains (2 sagittal); Slc47a1: n = 1 brain (sagittal); Dcn: n = 3 brains (2 sagittal, 1 coronal). c, Single-molecule RNA FISH by RNAscope for Osr1, Spp1 and Lum mRNAs shows labelling at the pial surface and surrounding vessel-like structures. Images are representative of two independent RNAscope experiments on n = 2 brains. On the basis of the coexpression of marker genes shown in a, VLMCs within the grey matter (expanded region 2) are probably the VLMC-Spp1-Col15a1 type, whereas VLMCs in pia between cortex and tectum (expanded region 3) are probably VLMC-Osr1-Mc5r. 
The surface VLMCs (expanded region 1) appear to express all three markers, which are usually not co-detected in single cells by RNA-seq (a). This finding could be explained by a possibility that region 1 may contain two or more types of spatially appositioned VLMCs, for example, VLMC-Osr1-Cd74 (based on Slc47a1 expression shown in b), as well as one of the Lum+ types (VLMC-Osr1-Mc5r and/or VLMC-Spp1-Col15a1).
Extended Data Fig. 10 | Retro-seq and comparison of cell types across regions.
a, b, Injection targets for retro-seq, represented by select Allen Reference Atlas images are displayed on the left. The plots on the right show injection targets in rows and cell types in columns for annotated retro-seq cells collected from the ALM (n = 1,152) (a) or VISp (n = 1,052) (b). Cell numbers are represented as discs, coloured according to detected cell types. Cell numbers from each target segregated by categories (based on broad type or virus injected) are shown to the right. We used three types of viral tracers expressing Cre: CAV2-Cre, rAAV2-retro-EF1a-Cre and RVΔGL-Cre, and injected them into a Cre-reporter line Ai14. For ALM experiments, we also injected rAAV2-retro-CAG-GFP or rAAV2-retro-CAG-tdT into wild-type mice. To ensure diverse coverage of projection neuron types, at least two virus types were used for most broad target regions (except for striatum and tectum for ALM, and tectum for VISp), as different viruses may display different tropisms. Cell types that were never isolated from the retrograde tracing experiments are shaded pink. Grey-hatched regions denote cells that may have been labelled unintentionally (but unavoidably) through the needle injection tract. For most subcortical injections into VISp-projection areas, the needle goes through the cortex, and some IT cells are labelled through the virus deposited along the needle tract. One exception is the injection into the superior colliculus for VISp experiments, in which we avoided cortical labelling by injecting at an angle through the cerebellum (Methods). Each injection target is labelled according to the centre of the corresponding injection site; however, neighbouring regions are often infected (Supplementary Table 5). 
Reference atlas abbreviations are as follows: ACA, anterior cingulate area; ALM-c, contralateral anterior lateral motor area; CP-c, contralateral caudoputamen; CTX, cortex; GRN, gigantocellular reticular nucleus; IRN, intermediate reticular nucleus; LD, lateral dorsal nucleus of the thalamus; LGd, dorsal lateral geniculate complex; LP, lateral posterior nucleus of the thalamus; MD, mediodorsal nucleus of the thalamus; MOp, primary motor area; MY, medulla; ORBl-c, contralateral orbital area, lateral part; P, pons; PARN, parvicellular reticular nucleus; PERI, perirhinal area; PF, parafascicular nucleus; PG, pontine grey; PRNc, pontine reticular nucleus dorsal part; RSP, retrosplenial area; SC, superior colliculus; SCs, superior colliculus sensory related area; SSp, primary somatosensory area; SSs, supplementary somatosensory area; STR, striatum; TEC, tectum; TH, thalamus; VISp-c, contralateral primary visual area; ZI, zona incerta. c, Mapping of glutamatergic cells from ALM onto VISp glutamatergic cell types (grey arrows) using a random forest classifier trained on VISp types, and vice versa (blue-grey arrows; Methods). The fraction of cells that mapped with high confidence onto clusters from the other region is represented by the weight of the arrows. The best matched types were used in Fig. 2c. For this comparison, the 4,519 ALM cells and 7,352 VISp cells from glutamatergic types excluding CR-Lhx5 were used. d, e, RNA ISH from the Allen Mouse Brain Atlas for select markers confirms areal gene expression specificity. Images contain regions of interest from representative sections selected from individual whole-brain RNA ISH experiments. 
The number of whole-brain experiments per gene available in the Allen Brain Atlas is as follows: Wnt7b: n = 3 brains (1 sagittal, 2 coronal); Postn: n = 2 brains (1 sagittal, 1 coronal); Rxfp2: n = 2 brains (1 sagittal, 1 coronal); Chrna6: n = 3 brains (1 sagittal, 2 coronal); Stac: n = 2 brains (1 sagittal, 1 coronal); Scnn1a: n = 2 brains (1 sagittal, 1 coronal). Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Extended Data Fig. 11 | Validation of glutamatergic marker gene expression and cell type location by RNA FISH and projection subclasses by anterograde tracing.
a-c, Single-molecule RNA FISH with RNAscope was used to validate marker expression and cell type distribution. a, Example image shows fluorescent spots that correspond to Rspo1 (cyan), Hsd11b1 (green) and Scnn1a (red) mRNA molecules in a 10-μm coronal VISp section. Scale bars are in micrometres. b, Example of processed data from a; white square in a corresponds to green square in b. Data processing steps involved creation of maximum projection of a montage of confocal z-stacks, identifying nuclei, quantifying the number of fluorescent spots, assignment of spots to each nucleus by CellProfiler and horizontal compression of the data to emphasize layer enrichment of examined cells. Each dot in the panel represents a cell plotted according to the detected nucleus position. Each cell was coloured according to the quantified fluorescent labelling. The first three panels show cells shaded according to the quantified number of spots per cell: Rspo1 and Scnn1a mRNAs are enriched in L4 and Hsd11b1 at the L4-L5 border. The fourth panel shows location of cells co-labelled with two or more probes and confirms scRNA-seq data: coexpression of Rspo1 and Scnn1a is expected in the L4-IT-VISp-Rspo1 type and coexpression of Hsd11b1 and Scnn1a in the L5-IT-VISp-Hsd11b1-Endou type. c, Condensed plots for six individual representative RNAscope experiments (Exp) in VISp for select glutamatergic cell type markers. The number of times (n) each experiment was performed independently to produce similar results is listed below each experiment. Layers were delineated based on cell density. Data in a and b correspond to Exp1. d, A schematic of laminar distributions of VISp glutamatergic types according to experiments in c corroborates previous evidence showing that L5 IT and L5 pyramidal tract cells are not well separated into 5a and 5b sublayers in the visual cortex, compared to the primary somatosensory cortex. 
Note that in ALM, even subtypes of L5b with different projections are well segregated into upper and lower sublayers (see accompanying study). e, To confirm the projection patterns of several transcriptomic types and examine them in greater detail, we performed anterograde tracing by Cre-dependent adeno-associated virus (AAV) in select Cre lines. We have previously characterized cell type labelling by these Cre lines (Extended Data Fig. 8). In one case, we used a Cre line with a similar pattern of expression with viral reporter in adulthood (Cux2-IRES-cre instead of Cux2-IRES-creERT2). f-k, Each image is a projection generated from a series of images obtained by TissueCyte 1000 from a representative anterograde tracing experiment; additional experiments are available at http://connectivity.brain-map.org/. f, L5 and L2/3 IT types as labelled in Tlx3-cre_PL56, and Cux2-IRES-cre lines display extensive long-range projections that cover all layers with preference for upper layers. h, By contrast, L6 IT types, labelled by the newly generated Penk-IRES2-cre-neo line (Extended Data Fig. 8), project to many of the same areas as the L2-L3 and L5 IT types, but their projections are confined to lower layers in all areas examined, including higher visual areas and contralateral VISp and ALM. i, As revealed by the retro-seq data (Extended Data Fig. 10b), we confirm that L4-IT-VISp-Rspo1 type, which represents most cells labelled by Nr5a1-cre (Extended Data Fig. 8) projects to contralateral VISp (Fig. 3d). Notably, this projection is observed only when injection is performed in the most anterior portion of VISp (compare left and right panels in i). j, L6b types labelled by Ctgf-2A-dgcre have sparse projections to the anterior cingulate area. k, Consistent with the retro-seq data, the near-projecting types labelled by Slc17a8-IRES2-cre (Extended Data Fig. 8) do not have long-distance projections, but only local projections and sparse projections to nearby areas. 
Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Extended Data Fig. 12 | Mapping of previously published scRNA-seq samples and Patch-seq samples to our dataset.
scRNA-seq data obtained from sorted cells or cell content extracted by patching were mapped to our transcriptomic types using a centroid classifier (Methods). a, River plot showing the mapping of single-cell transcriptomes described previously (n = 584 cells) to our types. b, Alternative representation of the results in a, with blue discs representing the number of single-cell transcriptomes published previously mapped onto a dendrogram of GABAergic cell types in this study. Each blue disc area represents the total number of single-cell transcriptomes mapped to one of our cell types. c, Expression of Calb2, Vip and Cck in single cells from our Sncg and Vip subclasses (n = 3,225). Transgenic recombinase lines based on these genes were used to label CCK basket cells (CCKC) and interneuron-selective cells (ISC) described previously. Boxes highlight our types to which the CCK basket cells and interneuron-selective cells described previously were mapped. CCK basket cells and interneuron-selective cells as defined previously each correspond to several of our transcriptomic types. d, Patch-seq data for 58 cells described previously were mapped to our transcriptomic types. Some cells could not be mapped with high confidence to terminal leaves of our taxonomy, and were therefore mapped to an internal node (cluster labels on the right that start with ‘n’ for node, see f). e, Constellation diagram showing corresponding types described previously and Lamp5 cell types from this study. Correspondences with the neurogliaform cells (NGC, orange) and single-bouquet cells (SBC, blue) defined previously are shown by the colours applied to the left side of each disc. f, Alternative representation of the result in d, with blue discs representing the number of single cell transcriptomes described previously mapped onto a dendrogram of GABAergic cell types in this study. 
Each blue disc area represents the total number of single cell transcriptomes mapped to a type (terminal leaf) or node in our taxonomy.
Extended Data Fig. 13 | A new tool, Vipr2-IRES2-Cre, for access to select transcriptomically defined cell types.
a, Expression of select marker genes in our transcriptomically defined cell types (colour bar on top) represented as violin plots for all cluster-assigned cells (n = 23,822 cells; 133 clusters). Median values are black dots. Each row is scaled to the maximum expression value shown to the right of the plot (in CPM), and displayed on a linear scale. Venn diagrams represent expected cell type labelling by genetic tools described below. A new transgenic line, Vipr2-IRES2-cre, was created to label Pvalb-Vipr2 and Meis2 types. Unlike the previously developed Nkx2.1-creERT2 line, this line does not require tamoxifen induction for chandelier cell labelling (corresponding to Pvalb-Vipr2). b-f, Specificity of this recombinase line was tested by scRNA-seq and immunohistochemistry. b, scRNA-seq and clustering with other cells revealed cell types labelled by two mouse genotypes on the right. Types are labelled on the bottom in standard colours. Only cell types with at least one cell labelled are displayed. n = 329 cells from Vipr2-IRES2-cre/wt;Ai14/wt; n = 38 cells from Vipr2-IRES2-cre/wt;Slc32a1-T2A-FlpO/wt;Ai65/wt. c-f, Representative images of immunohistochemistry results; each image is representative of n = 2 experimental animals with stated genotypes. High-magnification images show tdT labelling in black, anti-PVALB in green and anti-tdTomato in red. Tissue sections (100 μm) were stained with anti-PVALB, anti-dsRed (labels tdTomato), and DAPI. Images are maximum intensity projections of confocal z-stacks. Scale bars are in micrometres. In Vipr2-IRES2-cre/wt;Ai14/wt mice (b, grey bars), apart from the expected labelling of chandelier cells (Pvalb-Vipr2), many non-neuronal cells are labelled (especially, Astro-Aqp4 and SMC-Aoc3 types, panel c). d, To improve labelling specificity, we created and examined Vipr2-IRES2-cre/wt;Slc32a1-T2A-FlpO/wt;Ai65/wt mice. 
As expected, labelling was more specific, now confined to chandelier cells, basket cells and Meis2 interneurons (dark bars in b and morphologically identified types in d). Notably, the chandelier cells within VISp did not express PVALB protein (d, panel 1). e, Genetic intersection of Vipr2 and Pvalb expression labelled cells with chandelier morphology corresponding to Pvalb-Vipr2 in ALM. f, However, Vipr2-IRES2-cre/wt;Pvalb-T2A-FlpO/wt;Ai65/wt did not label chandelier cells in VISp, but PVALB+ cells of basket morphology. This unexpected labelling may reflect historical expression of Vipr2 in a subset of other Pvalb cells or low adult Vipr2 expression that is not detected by scRNA-seq. In VISp-containing sections, some chandelier cells are labelled, but are observed outside of VISp (f, panel 3). RSP, retrosplenial cortex.
Extended Data Fig. 14 | Discreteness and continuity in cell type definition.
a, Within the L4-IT-VISp-Rspo1 type, n = 1,442 cells were arranged according to graded expression of 35 genes (only 26 are shown). b, Mapping of n = 394 L4 and L5 IT cells from our previous study to cell types in this study. The three L4 clusters from our previous study map primarily to the L4-IT-VISp-Rspo1 type. c, Position of L4-IT-VISp-Rspo1 within the VISp glutamatergic constellation diagram. d, Computational split of the L4-IT-VISp-Rspo1 type into three parts along the continuum. For comparison, equivalent plots indicate better separation between this cluster and select L5 IT clusters. n = 1,404 cells for L4-IT-VISp-Rspo1, 215 for L5-IT-Hsd11b1-Endou, and 435 for L5-IT-VISp-Batf3. e, Comparison of gene expression differences among the parts of L4-IT-VISp-Rspo1, as well as this type with select L5 IT types. By this measure, similar differences are detected between the two ‘end’ parts of L4-IT-VISp-Rspo1, and between L4-IT-VISp-Rspo1 and L5-IT-Hsd11b1-Endou. f, Constellation diagrams for n = 2,880 Sst subclass cells reflect clusters defined at three different deScores corresponding to clustering stringency: low, standard and high, starting with the same number of genes (n = 30,862, Methods). For low stringency, newly split clusters are enclosed by dashed contours. For high stringency, arrows indicate dominant merges.
Extended Data Fig. 15 | Cell types with activity-dependent transcriptomic signatures.
a, River plots representing the mapping of 14,205 VISp cells from this study to a previously published dataset using a centroid classifier (Methods). b, Violin plots show the distribution of log2-scaled and centred average expression of late-response genes (LRGs) and early-response genes (ERGs) in select cell types from this study that are in a subclass with at least one type that is significantly correlated with LRG or ERG expression. From the published dataset, only the clusters with expression of at least four ERGs and LRGs were included. In total, values for n = 6,956 cells from our study are displayed in the violin plots. We performed a two-sided t-test to assess enrichment or depletion of expression of LRGs or ERGs, and defined significant values as P < 0.01 after correction for multiple hypotheses using the Holm method, and average fold change greater than 2. Significant P values for enrichment and depletion are displayed above and below each row, respectively. Complete statistics for all t-tests are included in Supplementary Table 11. c, Heat maps of log2-scaled and centred average gene expression for ERGs and LRGs in select cell types from our study and from the published dataset.
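The significance rule in panel b (Holm-corrected P < 0.01 together with an average fold change greater than 2) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the input P values are assumed to come from t-tests computed elsewhere, and treating "fold change greater than 2" as an absolute log2 fold change above 1 in either direction (enrichment or depletion) is an assumption.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Return Holm step-down adjusted P values in the input order."""
    m = len(p_values)
    # Visit the P values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def is_significant(
    p_values: List[float],
    log2_fold_changes: List[float],
    alpha: float = 0.01,           # threshold stated in the caption
    min_abs_log2_fc: float = 1.0,  # assumption: fold change > 2 in either direction
) -> List[bool]:
    """Flag tests that pass both the Holm-adjusted P and fold-change cut-offs."""
    adjusted = holm_adjust(p_values)
    return [
        p_adj < alpha and abs(lfc) > min_abs_log2_fc
        for p_adj, lfc in zip(adjusted, log2_fold_changes)
    ]


if __name__ == "__main__":
    # Made-up P values and log2 fold changes, for illustration only.
    print(is_significant([0.001, 0.02, 0.0001], [1.5, 3.0, 0.5]))
```

With these made-up inputs, only the first test passes both cut-offs: the second fails the adjusted P threshold and the third fails the fold-change threshold.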
Extended Data Fig. 16 | Areal gene expression differences in GABAergic types.
a-c, t-SNE plots for Sst and Pvalb (n = 5,113 cells, generated using 1,244 differentially expressed genes), Lamp5, Serpinf1, Sncg and Vip (n = 5,365 cells, generated using 1,184 differentially expressed genes) and glutamatergic types (n = 11,905 cells, generated using 1,984 differentially expressed genes) showing cells labelled in cluster colours on the left and area-of-origin on the right. Glutamatergic cells show the most marked segregation by area of origin. Sst and Pvalb types show small but noticeable area-specific segregation, which is even less obvious for the Lamp5, Sncg and Vip types. d, Areas from the Sst and Pvalb t-SNE plots were enlarged to show partial segregation of cells within two Sst types by area of origin. e, Layer distribution and violin plots for marker genes Sst, Crh and Calb2 in cells from the same transcriptomic type divided by area. The number of cells in each type and region are shown below each column (n = 118 cells for ALM Sst-Calb2-Pdlim5; 120 for VISp Sst-Calb2-Pdlim5; 62 for ALM Sst-Esm1; and 69 for VISp Sst-Esm1). Violin plots are shown on a log10 scale, scaled to the maximum value for each gene (right of the plot in CPM); black dots are medians. In ALM, triple positive Sst+Crh+Calb2+ cells belong to Sst-Calb2-Pdlim5 type and are enriched in upper layers. In VISp, this same type does not express Crh, but a different type may account for Sst+Crh+Calb2+ cells: Sst-Esm1, and would be expected in lower layers. f, g, RNA FISH by RNAscope for Sst (green), Crh (red) and Calb2 (cyan) in ALM and VISp. Scale bars are in micrometres. In agreement with e, we find triple-positive cells by RNA FISH in ALM in upper layers, and in VISp in lower layers. Images are representative of a single experiment including multiple tissue sections from ALM and VISp. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Fig. 1 | Cell type taxonomy in ALM and VISp cortical areas.
a, Transgenically or retrogradely labelled cells and unlabelled cells were collected by layer-enriching or all-layer microdissections from the ALM or VISp. b, After dissociation, single cells were isolated by FACS or manual picking, mRNA was reverse transcribed (RT), amplified (cDNA amp.), tagmented and sequenced (next-generation sequencing, NGS). c, Clustering revealed 61 GABAergic, 56 glutamatergic, and 16 non-neuronal types organized in a taxonomy on the basis of median cluster expression for 4,020 differentially expressed genes, n = 23,822 cells and branch confidence scores > 0.4 (Extended Data Figs. 1–3). Cell classes and subclasses are labelled at branch points of the dendrogram. Bar plots represent fractions of cells dissected from the ALM and VISp, and from different layer-enriching dissections. Astro, astrocyte; CR, Cajal-Retzius cell; endo, endothelial cell; oligo, oligodendrocyte; OPC, oligodendrocyte precursor cell; peri, pericyte; PVM, perivascular macrophage; SMC, smooth muscle cell; VLMC, vascular leptomeningeal cell; IT, intratelencephalic; PT, pyramidal tract; NP, near-projecting; CT, corticothalamic. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Fig. 2 | Comparison of gene expression differences among types across cortical areas.
a, Two-dimensional t-distributed stochastic neighbour embedding (t-SNE) plots based on 4,020 differentially expressed genes for n = 23,822 cells, coloured by region, class and cluster. Most glutamatergic types are ALM- or VISp-specific. Most GABAergic types contain cells from both regions (salt-and-pepper clusters, left t-SNE). b, Number of differentially expressed (DE) genes (x axis) and mean difference in gene expression (y axis) for all 8,778 pairs of clusters. Left, comparisons between ALM and VISp portions of each GABAergic cluster (pink) and best-matched glutamatergic ALM and VISp clusters (blue). For comparison, centre and right panels show differences between: types within a subclass, types from different subclasses, non-neuronal types, types from different neuronal classes (GABA versus glutamate), and neuronal and non-neuronal types. Grey points represent all pairwise type comparisons; pink points are only in the left panel. c, Number of differentially expressed genes between best-matched ALM- and VISp-specific cell types (Extended Data Fig. 10c) or ALM and VISp portions for shared types. Cell types on the x axis are coloured as in Fig. 1; black horizontal line separates matched ALM and VISp types, but not the shared types. Black and grey bars denote the numbers of ALM- and VISp-enriched genes, respectively. d, ALM- or VISp-specific genes based on the proportion of cells in each region that express each gene, calculated separately for glutamatergic and GABAergic cells.
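The figure of 8,778 cluster pairs in panel b is consistent with counting every unordered pair among the 133 types (61 GABAergic, 56 glutamatergic and 16 non-neuronal, as given in Fig. 1c). The short check below is only an arithmetic illustration; the variable names are hypothetical and not taken from the paper.

```python
from itertools import combinations
from math import comb

# Type counts as stated in the Fig. 1c caption.
n_gabaergic, n_glutamatergic, n_non_neuronal = 61, 56, 16
n_types = n_gabaergic + n_glutamatergic + n_non_neuronal

# Number of unordered pairs: n * (n - 1) / 2.
n_pairs = comb(n_types, 2)

# Cross-check by explicitly enumerating the pairs of cluster indices.
enumerated_pairs = sum(1 for _ in combinations(range(n_types), 2))

print(n_types, n_pairs, enumerated_pairs)  # prints: 133 8778 8778
```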
Fig. 3 | Glutamatergic cell types by scRNA-seq and projections.
a, Retro-seq: after virus injections and brain sectioning, injection sites were imaged to determine injection specificity. Tissue was microdissected from the collection site (ALM or VISp) and processed as shown in Fig. 1b. b, Injection targets grouped into broad regions: cortex (CTX), striatum (STR), thalamus (TH), tectum (TEC), pons (P) or medulla (MY). c, Dendrogram of glutamatergic cell types in ALM followed by numbers of cells (represented by disc area) originating from retrograde labelling from regions on top. Shaded regions denote cells labelled unintentionally, directly or retrogradely through the needle (injection) tract. d, As in c, but for VISp. Only glutamatergic cells from the annotated retro-seq dataset were included: n = 1,138 out of 1,152 annotated cells in c, and 1,049 out of 1,052 annotated cells in d. See Extended Data Fig. 10a, b for further details. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
Fig. 4 | Glutamatergic cell types and markers.
a, Constellation diagram of ALM and VISp types. Disc areas represent core cell numbers for each cluster (n = 10,729), edge weights represent intermediate cell numbers (n = 1,136). L6-CT-Nxph2-Sla, L6b-Col8a1-Rprm, L6b-Hsd17b2 and L6b-P2ry12 are found in both areas. Cajal-Retzius type was omitted. b, Dendrograms correspond to glutamatergic portion of Fig. 1c. Layer distribution for each type was inferred from layer-enriching dissections (n = 8,477 out of 11,871 cells in glutamatergic clusters): each dot represents a cell positioned at random within each layer. Distributions are approximate owing to sampling strategy (Methods). c, Marker gene expression distributions within each cluster are represented by violin plots. Rows are genes, black dots are medians. Values within each row are normalized between 0 and maximum detected, displayed on a log10 scale (n = 11,827 cells).
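The row scaling described for the violin plots in panel c (each gene normalized between 0 and its maximum detected value, displayed on a log10 scale) could be sketched as follows. This is an illustrative guess at the transformation, not the authors' plotting code: the function name is hypothetical, and the pseudocount of 1 added before taking the logarithm is an assumption made so that zero expression maps to 0.

```python
import math
from typing import List


def scale_row_log10(cpm_values: List[float], pseudocount: float = 1.0) -> List[float]:
    """Scale one gene's expression values to the range [0, 1] on a log10 axis.

    The pseudocount is an assumption; it keeps log10 defined at zero expression.
    """
    logged = [math.log10(value + pseudocount) for value in cpm_values]
    row_max = max(logged)
    if row_max == 0.0:
        # Gene not detected in any cell: every scaled value is 0.
        return [0.0] * len(logged)
    return [value / row_max for value in logged]


if __name__ == "__main__":
    # Made-up CPM values for a single gene across three cells.
    print(scale_row_log10([0.0, 9.0, 99.0]))
```

Scaling each row separately lets genes with very different absolute expression levels share one panel, which is why the caption reports the per-gene maximum alongside the plot.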
Fig. 5 | GABAergic cell types by scRNA-seq.
a, b, Constellation diagrams for Sst and Pvalb (a) and Lamp5, Serpinf1, Sncg and Vip (b) types, as in Fig. 4a (n = 9,021 core cells; n = 1,457 intermediate cells). Edges connecting subclasses are pink. Meis2 type was omitted. c, d, Dendrograms are portions of Fig. 1c focused on the main GABAergic branch. Below the dendrograms, layer distribution for each type was inferred as in Fig. 4b; only cells from single-layer dissections were used: n = 4,675 out of 5,365 cells in c, and 3,908 out of 5,113 cells in d. Distributions are approximate owing to the sampling strategy (Methods). e, f, Marker gene expression distributions within each cluster are represented by violin plots as in Fig. 4c. n = 5,365 cells in e and 5,113 cells in f.
