Nat Neurosci. 2024 Jun;27(6):1075-1086.
doi: 10.1038/s41593-024-01624-4. Epub 2024 Apr 22.

Cortical gene expression architecture links healthy neurodevelopment to the imaging, transcriptomics and genetics of autism and schizophrenia

Richard Dear et al. Nat Neurosci. 2024 Jun.

Abstract

Human brain organization involves the coordinated expression of thousands of genes. For example, the first principal component (C1) of cortical transcription identifies a hierarchy from sensorimotor to association regions. In this study, optimized processing of the Allen Human Brain Atlas revealed two new components of cortical gene expression architecture, C2 and C3, which are distinctively enriched for neuronal, metabolic and immune processes, specific cell types and cytoarchitectonics, and genetic variants associated with intelligence. Using additional datasets (PsychENCODE, Allen Cell Atlas and BrainSpan), we found that C1-C3 represent generalizable transcriptional programs that are coordinated within cells and differentially phased during fetal and postnatal development. Autism spectrum disorder and schizophrenia were specifically associated with C1/C2 and C3, respectively, across neuroimaging, differential expression and genome-wide association studies. Evidence converged especially in support of C3 as a normative transcriptional program for adolescent brain development, which can lead to atypical supragranular cortical connectivity in people at high genetic risk for schizophrenia.

Conflict of interest statement

K.M.A. is an employee of Neumora Therapeutics. R.D.M. is an employee of Octave Biosciences. E.T.B. has consulted for Boehringer Ingelheim, SR One, GlaxoSmithKline, Sosei Heptares and Monument Therapeutics. All other authors have no disclosures to make.

Figures

Fig. 1. Three generalizable components of human cortical gene expression were enriched for biological processes, cytoarchitecture and cognitive capacity.
a, To identify robust components of cortical gene expression, we split the six-brain AHBA dataset into two disjoint triplets of brains, applied PCA to each triplet and correlated the resulting matched components (C1, C2, C3…) (Methods). For each component, the median absolute correlation over all 10 permutations of triplet pairs was a proxy for its generalizability, g. Using PCA and previously published best practices for processing the AHBA dataset, generalizability decreased markedly beyond the first component: gC1 = 0.78, gC2 = 0.09, gC3 = 0.14. Using DME with the top 50% most stable genes, and the 137 regions with data available from at least three brains, the generalizability of the first three components substantially increased: gC1 = 0.97, gC2 = 0.72, gC3 = 0.65. b, Cortical maps of brain regional scores of components C1–C3 estimated by DME on the filtered AHBA dataset displayed smooth spatial gradients (right; Moran’s I 0.48, 0.58 and 0.21 for C1–C3, respectively), unlike those of PCA on the unfiltered data (left; Moran’s I 0.50, 0.09 and 0.07). c, GO biological process enrichments for C1–C3 showed that the number of significant enrichments was greater for higher-order components, illustrating that they were more biologically specific. C2-positive genes were enriched for metabolism, whereas C2-negative genes were enriched for regulatory processes. C3-positive genes were enriched for synaptic plasticity and learning, whereas C3-negative genes were enriched for immune processes. d, C1–C3 were distinctively enriched for marker genes of six cortical layers and white matter (WM). e, C1–C3 were also distinctively enriched for marker genes of cell types and synapses. f, All three components were significantly enriched for genes mapped to common variants associated with educational attainment in previous GWAS data.
g, C2 and C3 (but not C1) were significantly enriched for genes mapped to common variation in intelligence and cognition across four independent GWAS studies. For d–g, significance was computed by two-sided permutation tests (Methods) and FDR-corrected across all tests in each panel; *P < 0.05, **P < 0.01, ***P < 0.001.
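The split-half generalizability g described in panel a can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `extract_components` is a hypothetical callable standing in for PCA or DME, components are matched simply by index, and the toy extractor at the bottom returns a mean regional profile rather than a true principal component.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def disjoint_triplet_pairs(donors):
    """All ways to split six donors into two disjoint triplets (10 in total)."""
    donors = list(donors)
    first, rest = donors[0], donors[1:]
    pairs = []
    # Fixing the first donor in triplet A avoids counting each split twice.
    for others in combinations(rest, 2):
        a = (first,) + others
        b = tuple(d for d in donors if d not in a)
        pairs.append((a, b))
    return pairs


def generalizability(donors, extract_components, n_components=3):
    """Median |r| between index-matched components over all triplet pairs.

    extract_components(triplet) is a hypothetical stand-in for PCA or DME
    and must return a list of component vectors (one list of floats each).
    """
    per_component = [[] for _ in range(n_components)]
    for a, b in disjoint_triplet_pairs(donors):
        comps_a, comps_b = extract_components(a), extract_components(b)
        for k in range(n_components):
            per_component[k].append(abs(pearson(comps_a[k], comps_b[k])))
    return [median(rs) for rs in per_component]


# Toy stand-in (NOT PCA): one "component" = mean regional profile of a triplet,
# built from a shared gradient plus small deterministic donor-specific offsets.
BASE = [1.0, 2.0, 3.0, 4.0, 5.0]


def toy_extract(triplet):
    profile = []
    for r, base in enumerate(BASE):
        offsets = [((d * 7 + r * 3) % 5 - 2) * 0.1 for d in triplet]
        profile.append(base + sum(offsets) / len(offsets))
    return [profile]


g = generalizability(range(6), toy_extract, n_components=1)
```

Taking the absolute correlation makes g insensitive to the arbitrary sign of a component, which matters because PCA-like decompositions can flip signs between data subsets.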
Fig. 2. Neuroimaging and macroscale maps of brain structure, function and development were distinctively co-located with three components of cortical gene expression.
a, Correlation matrix of intrinsic transcriptional components C1–C3 together with the nine neuroimaging-derived and physiologically derived maps that Sydnor et al. combined with C1 to define the S-A axis of brain organization. Many of the maps were not highly correlated to each other (median |r| = 0.31), and data-driven clustering of the matrix revealed three distinct clusters around each of the mutually orthogonal transcriptional components C1–C3, demonstrating that all three components are relevant for understanding macroscale brain organization. b, Distributions of regional scores of C1–C3 in histologically defined regions of laminar cytoarchitecture. C1 distinguished idiotypic (P = 0.005) and paralimbic (P = 0.002) regions, whereas C3 distinguished idiotypic (P = 0.002) and heteromodal (P = 0.01) regions. *P < 0.05, FDR-adjusted two-sided permutation test as the percentile of the mean z-score relative to null spin permutations, with adjustment for multiple comparisons across all 12 tests. c, Degree of fMRI connectivity was significantly aligned to C1 (r = 0.78, Pspin < 0.001). Blue/yellow highlighted points correspond to idiotypic/paralimbic cytoarchitectural regions as in b. d, MEG-derived theta power was significantly aligned to C2 (r = 0.78, Pspin = 0.002). e, Regional change in myelination over adolescence was significantly aligned to C3 (r = 0.43, Pspin = 0.009). Blue/red highlighted points correspond to idiotypic/heteromodal cytoarchitectural regions as in b. In c and d, *P < 0.05, **P < 0.01, ***P < 0.001, FDR-corrected two-sided spin-permutation test, with corrections for multiple comparisons of all maps in c and d being compared with all of C1–C3.
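The FDR adjustment across a family of tests mentioned in this caption can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which the caption does not name, and the p-values in the example are hypothetical rather than taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adj]
```

Each adjusted value is the smallest false discovery rate at which that test would be called significant, so thresholding `adj` at 0.05 corresponds to the single-asterisk level used in the figures.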
Fig. 3. Transcriptional components represent intracellular coordination of gene expression programs with distinct developmental trajectories.
a, For each of approximately 50,000 single-cell RNA-seq samples, the weighted average expression of the negatively weighted genes of each AHBA component C1–C3 is plotted against that of the positively weighted genes (Methods). Samples are colored by cell type, demonstrating that genes positively and negatively weighted on C1–C3 have correlated expression within each major class of brain cells. Astro, astrocytes; Endo, endothelial cells; Micro, microglia; N-Ex, excitatory neurons; N-In, inhibitory neurons; Oligo, oligodendrocytes; OPC, oligodendrocyte precursor cell. Inset, a subset of samples from L2 VIP interneurons, illustrating that C1–C3 weighted genes were transcriptionally coupled even within a fine-grained, homogeneous group of cells. b, Cortical maps representing the regional scores of components C1–C3 for each of 11 regions with transcriptional data available in the BrainSpan cohort of adult brains (left) and C1–C3 component scores for the matching subset of regions in the AHBA (right). c, Scatter plots of matched regional C1–C3 scores from b, demonstrating that the three transcriptional components defined in the AHBA had consistent spatial expression in BrainSpan. d, Correlations between AHBA C1–C3 scores and BrainSpan C1–C3 scores (as in c) for each of three age-defined subsets of the BrainSpan dataset. C1 and C2 component scores were strongly correlated between datasets for all age subsets, whereas C3 component scores were strongly correlated between datasets only for the 18–40-year subset of BrainSpan. This indicates that C1 and C2 components were expressed in nearly adult forms from the earliest measured phases of brain development, whereas C3 was not expressed in adult form until after adolescence. 
e, Developmental trajectories of brain gene expression as a function of age (−0.5 years to 40 years; x axis, log scale) were estimated for each gene (Methods) and then averaged within each decile of gene weights for each of C1–C3; fitted lines are color-coded by decile. Genes weighted positively on C3 were most strongly expressed during adolescence, whereas genes weighted strongly on C1 or C2 were most expressed in the first 5 years of life. Dots above the x axis represent the postmortem ages of the donor brains used to compute the curves. RPKM, reads per kilobase per million mapped reads.
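The decile averaging in panel e can be sketched as below. This is an illustrative simplification: it assumes each gene's trajectory has already been fitted and sampled on a common age grid, and the function name and the rank-based binning rule are assumptions, not details given in the caption.

```python
def decile_average_trajectories(weights, trajectories, n_bins=10):
    """Average per-gene trajectories within rank-based bins of gene weight.

    weights:      dict mapping gene -> component weight
    trajectories: dict mapping gene -> list of expression values on a
                  shared age grid (assumed already fitted)
    Returns a list of n_bins averaged curves, lowest-weight bin first;
    an empty bin yields None.
    """
    genes = sorted(weights, key=weights.get)
    n = len(genes)
    bins = [[] for _ in range(n_bins)]
    for rank, gene in enumerate(genes):
        bins[rank * n_bins // n].append(trajectories[gene])
    averaged = []
    for curves in bins:
        if not curves:
            averaged.append(None)
            continue
        # zip(*curves) groups the values of all genes at each age point.
        averaged.append([sum(vals) / len(vals) for vals in zip(*curves)])
    return averaged


# Toy data: 20 hypothetical genes with two age points each.
toy_weights = {f"g{i}": float(i) for i in range(20)}
toy_curves = {f"g{i}": [float(i), 2.0 * i] for i in range(20)}
decile_curves = decile_average_trajectories(toy_weights, toy_curves)
```

Binning by rank rather than by weight value gives every decile the same number of genes, so the averaged curves are comparably smooth across the range of weights.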
Fig. 4. Genetics, transcriptomics and neuroimaging of autism and schizophrenia were consistently and specifically linked to normative transcriptional programs.
a, First row: cortical volume shrinkage in ASD, MDD and schizophrenia (SCZ) cases. Red indicates greater shrinkage, computed as z-scores of centiles from normative modeling of more than 125,000 MRI scans. Second row: AHBA components projected into the same Desikan–Killiany parcellation. b, Spatial correlations between volume changes and AHBA components, C1–C3. Significance was tested by two-sided FDR-adjusted spatially autocorrelated spin permutations and corrected for multiple comparisons. c, Enrichments in C1–C3 for consensus lists of DEGs in postmortem brain tissue of donors with ASD, MDD and SCZ compared to healthy controls (Methods). Significance was assessed as percentile of mean weight of DEGs in each component relative to randomly permuted gene weights and corrected for multiple comparisons; two-sided FDR-adjusted P values. d, Enrichment in C1–C3 for GWAS risk genes for ASD, MDD and SCZ, tested for significance as in c, demonstrating alignment with both spatial associations to volume changes and enrichments for DEGs. e, Venn diagrams showing the lack of overlap of DEGs and GWAS risk genes reported by the primary studies summarized in c and d. f, DEGs and GWAS risk genes for each disorder were filtered for only C3-positive genes and then tested for enrichment with marker genes for each cortical layer. Significance was tested by one-sided Fisher’s exact test and corrected for multiple comparisons across all 42 tests. C3-positive DEGs and GWAS genes for SCZ (but not ASD or MDD) were both enriched for L2 and L3 marker genes, despite the DEGs and GWAS gene sets having nearly no overlap for each disorder (see Extended Data Fig. 6 for more detail). g, Convergent with L2/L3 enrichment in the C3-positive SCZ-associated DEGs and GWAS genes, a cortical map of supragranular-specific cortical thinning in SCZ was significantly and specifically co-located with C3 (r = 0.55, two-sided spin-permutation P = 0.002); each point is a region, and color represents C3 score. 
*P < 0.05, **P < 0.01, ***P < 0.001.
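The gene-set enrichment test in panels c and d compares the mean component weight of a gene set against randomly permuted gene weights. A minimal sketch follows, with assumptions stated plainly: permuting the weights is implemented as drawing a random same-sized set of weights without replacement, the two-sided P value is computed from absolute deviations around the null mean, and all gene names and weights are hypothetical.

```python
import random


def permutation_enrichment(weights, gene_set, n_perm=2000, seed=0):
    """Two-sided permutation test for the mean weight of a gene set.

    weights:  dict mapping gene -> component weight
    gene_set: collection of genes of interest (e.g. DEGs or GWAS genes)
    Returns (observed_mean, p_value).
    """
    rng = random.Random(seed)
    values = list(weights.values())
    members = [g for g in weights if g in gene_set]
    k = len(members)
    observed = sum(weights[g] for g in members) / k
    # Null distribution: mean weight of k genes drawn at random.
    null = [sum(rng.sample(values, k)) / k for _ in range(n_perm)]
    null_mean = sum(null) / n_perm
    extreme = sum(abs(m - null_mean) >= abs(observed - null_mean) for m in null)
    # The +1 terms keep the estimate away from an impossible P value of 0.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Toy weights for 100 hypothetical genes, evenly spread around zero.
toy_weights = {f"gene{i}": (i - 49.5) / 100 for i in range(100)}
top_set = {f"gene{i}" for i in range(90, 100)}        # strongly positive genes
spread_set = {f"gene{i}" for i in range(0, 100, 10)}  # evenly spread genes

obs_top, p_top = permutation_enrichment(toy_weights, top_set)
obs_spread, p_spread = permutation_enrichment(toy_weights, spread_set)
```

In this toy case the strongly positive set gives a small P value and the evenly spread set does not, which is the contrast the enrichment panels are designed to show.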
Extended Data Fig. 1. Optimised processing of the AHBA identified three generalisable components.
a, In the HCP-MMP parcellation, 43/180 regions are matched to samples representing less than 3 of the 6 AHBA donors. b, Distribution of differential stability of genes measured in the AHBA dataset processed in the HCP-MMP parcellation. c, Generalisability of first five components of the AHBA dataset computed with either principal components analysis (PCA) or diffusion map embedding (DME). Color represents generalisability g, defined as the median absolute correlation between matched components computed across all 10 disjoint triplet pairs (Methods); x-axis represents variation in the proportion of genes filtered out by differential stability prior to PCA/DME; y-axis represents variation in which regions are filtered out prior to PCA/DME. Tick mark indicates parameter combinations that exceed generalisability g > 0.6. Green highlights for C3 indicate the best parameter option with PCA and DME respectively, showing that switching to DME achieves similar generalisability while retaining more genes. d, Scatter plots of regional scores for AHBA components computed using the best PCA/DME options, demonstrating that PCA and DME derive spatially equivalent components.
Extended Data Fig. 2. Transcriptional components were robust to parcellation and processing.
a, Transcriptional components were computed in four different parcellation templates (Methods). For each parcellation, the gene weights for the first three components were correlated with the weights obtained from the HCP-MMP parcellation used throughout. Gene weights were highly consistent, although in the less-granular (34-regions/hemisphere) Desikan-Killiany parcellation, C2 and C3 were less well aligned to the other parcellations. b, A wide range of parameters for processing the AHBA data were varied, and the resulting component region scores were correlated with the components obtained from the optimised parameters. For nearly all variations in parameters, highly consistent components were obtained, demonstrating the robustness of C1-C3.
Extended Data Fig. 3. AHBA transcriptional components were reproducible in independent PsychENCODE control data, with differential spatial expression in autism.
a, Gene weights from dimension reduction applied to group-averaged bulk RNA-seq measurements from 11 cortical regions in N = 54 healthy control brains from the PsychENCODE dataset were correlated with gene weights from the components of the AHBA (derived by DME in the 180-region HCP-MMP parcellation), showing that the genetic profiles of AHBA C1, C2, and C3 were reproduced by PsychENCODE C1, C2, and C4, respectively (highlighted in green). b, Regional scores of PsychENCODE C1, C2 and C4 were also correlated with region scores of AHBA C1, C2 and C3, showing that the matching genetic profiles correspond to matching spatial expression patterns. c, Variance explained by the first five components of each dataset, showing that AHBA C3 and PsychENCODE C4 account for similar proportions of variance (6.5% and 7.1%, respectively). d, 1st row: Cortical maps of AHBA C1-C3 in the same 11 regions sampled in the PsychENCODE data. 2nd row: Cortical maps of PsychENCODE C1, C2, and C4 demonstrating their spatial similarity to AHBA C1-C3. 3rd row: Gene weights from the PsychENCODE healthy control data were projected onto transcriptional data of cases with autism spectrum disorder (ASD; N = 58) from the same dataset, demonstrating lower regional expression at the positive (red) pole of each component in the ASD cases compared to healthy controls. e, Distributions of regional scores for C1, C2 and C4, computed on group-average healthy controls as in a-d and projected to individual donor brains in the PsychENCODE dataset, demonstrating significant case-control differential expression for regions at the positive poles of C1-C3. T-tests of case-control differences were corrected for multiple comparisons across all 33 tests; boxplots represent the median, first, and third quartiles with whiskers showing 1.5 * inter-quartile range; *, **, *** indicate FDR-corrected two-sided p-value < 0.05, 0.01, 0.001 respectively. 
Region names refer to the sampled Brodmann Areas (BA): Visual = BA17, Temporal Pole = BA38, Somatosensory = BA3-1-2-5, Motor = BA4-6, Anterior Cingulate = BA24, Prefrontal = BA9, Broca's Area = BA44-45, Fusiform Gyrus = BA20-37, Auditory = BA41-42-22, Lateral Parietal = BA39-40, Dorsal Parietal = BA7.
Extended Data Fig. 4. Higher-order components of cortical gene expression reflect anatomically relevant co-expression structure.
a, C1-C3 were orthogonally aligned in anatomical space, as computed by the Pearson’s correlations of the regional scores with the XYZ coordinates of the region centroids: C1 and C2 were both aligned with the anterior-to-posterior (y) and ventral-to-dorsal (z) plane, but with opposite signs along the anterior-to-posterior axis, while only C3 was aligned to the medial-lateral (x) axis. The middle panel represents these alignments as vectors in 3D space. The right-hand upper table shows the correlations of C1-C3 with each anatomical axis, and the lower table shows the angle in degrees between the vectors, showing that C1-C3 are orthogonal. b, Co-expression matrices computed by Pearson’s correlations of gene expression between brain regions, computed with and without regressing out the first component C1, and annotated by the major cortical lobes as defined in the HCP-MMP parcellation. This further demonstrates that the gene co-expression structure captured by C2 and C3 (that is, the residual variation beyond C1) is anatomically relevant.
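The anatomical alignment in panel a (correlating regional scores with centroid coordinates, then measuring the angle between the resulting vectors) can be sketched as follows. The function names and the cube-corner toy geometry are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def axis_alignment(scores, centroids):
    """Correlate regional scores with the x, y and z centroid coordinates."""
    return tuple(
        pearson(scores, [c[axis] for c in centroids]) for axis in range(3)
    )


def angle_degrees(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Toy geometry: eight hypothetical region centroids at the corners of a cube.
centroids = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
scores_a = [c[0] for c in centroids]  # varies only along the x axis
scores_b = [c[1] for c in centroids]  # varies only along the y axis

vec_a = axis_alignment(scores_a, centroids)
vec_b = axis_alignment(scores_b, centroids)
angle = angle_degrees(vec_a, vec_b)
```

An angle near 90 degrees between two alignment vectors indicates that the corresponding components vary along perpendicular anatomical directions.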
Extended Data Fig. 5. Transcriptional components were distinctively associated with the regional power of canonical brain oscillation frequencies.
Several MEG power bands were highly correlated (|r|>0.6) with C1 (delta, alpha, high-gamma) and C2 (beta, theta), although only the theta association to C2 survived FDR correction of the spin-test p-values (r = 0.78, FDRspin = 0.05). No MEG band was aligned with C3.
Extended Data Fig. 6. C3 reveals shared biology across inconsistent postmortem brain RNA-seq studies of differentially expressed genes (DEGs) in schizophrenia.
a, Euler diagram demonstrating the relative lack of overlap of genes linked to schizophrenia in four independent RNA-seq postmortem brain studies, as well as the latest GWAS study. b, Histogram of the schizophrenia GWAS and consensus DEG genes by C3 decile. The skew of the histograms towards higher C3 deciles reflects the significant enrichment of both non-overlapping gene sets, as in Fig. 4c,d. c, Histograms of the schizophrenia GWAS and DEG genes from each separate study by C3 decile, coloured by cortical layer where the gene was identified as a marker gene. L2 genes are distinctly clustered towards the C3+ pole, while L1 and WM genes are clustered towards C3-. d, For schizophrenia and ASD, enrichments of the GWAS/DEG genes from each separate study for marker genes of cortical layers, showing that no consistent significant enrichments are found across the entire gene sets for studies of either disorder. e, Enrichments as in d, except for only genes positively weighted in C3 (corresponding to the right-hand five deciles of each histogram in panel c). For schizophrenia, significant enrichments for L2 and L3 are observed for three of the four DEG studies, as well as the GWAS study. No such enrichments were observed for ASD, demonstrating that C3 reveals convergent biology across otherwise inconsistent results specifically for schizophrenia. Significance was tested by one-sided Fisher’s exact test and corrected for multiple comparisons across all tests in each panel. *, **, *** indicate FDR-corrected one-sided p-value < 0.05, 0.01, 0.001 respectively.
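The one-sided Fisher's exact test used for the layer-marker enrichments is equivalent to the upper tail of a hypergeometric distribution. A minimal standard-library sketch, with hypothetical argument names and toy counts, is:

```python
from math import comb


def fisher_one_sided(overlap, n_query, n_markers, n_universe):
    """P(X >= overlap) when n_query genes are drawn from n_universe genes,
    of which n_markers are marker genes (upper-tail hypergeometric).

    overlap:    number of query genes that are also marker genes
    n_query:    size of the query set (e.g. C3-positive DEGs)
    n_markers:  size of the marker set (e.g. L2 marker genes)
    n_universe: number of background genes
    """
    total = comb(n_universe, n_query)
    upper = min(n_query, n_markers)
    tail = sum(
        comb(n_markers, k) * comb(n_universe - n_markers, n_query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts: all 3 query genes are among 3 markers in a 10-gene universe.
p_full_overlap = fisher_one_sided(3, 3, 3, 10)
# With no required overlap the tail covers every outcome.
p_no_overlap = fisher_one_sided(0, 3, 3, 10)
```

The test is one-sided because only over-representation of marker genes counts as evidence of enrichment; depletion is not tested.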
