Science. 2024 May 24;384(6698):eadh0829.
doi: 10.1126/science.adh0829. Epub 2024 May 24.

Cross-ancestry atlas of gene, isoform, and splicing regulation in the developing human brain

Cindy Wen et al. Science.

Abstract

Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.

Conflict of interest statement

This article was prepared while M.A.P. was employed at Sage Bionetworks. M.J.G. and D.H.G. receive grant funding from Mitsubishi Tanabe Pharma America. Z.W. cofounded Rgenta Therapeutics and serves as a scientific adviser for the company and a member of its board. The remaining authors declare no competing interests. The opinions expressed in this article are the authors’ own and do not reflect the view of the National Institute on Aging, the National Institutes of Health, the Department of Health and Human Services, or the US Government.

Figures

Fig. 1. The landscape of gene, splicing, and isoform regulation in the developing human brain.
(A) Number of eGenes versus sample size discovered here compared with other human brain studies (–, , –27). (B) Overlap of eGenes between developing brain (N = 629), GTEx v8 Brain Cortex (N = 205), and PsychENCODE adult brain studies (N = 1387). (C) Correlation of eQTL effect size, measured by allelic fold change (aFC), between developing and adult brain datasets. Each dot is a shared eGene-primary eQTL pair between the developing brain and GTEx (247 pairs) or PsychENCODE (253 pairs) datasets. (D) Overlap among eGenes, isoGenes, and sGenes. (E) Distance between the TSS of each target gene of cis-eQTL, cis-isoQTL, and cis-sQTL SNPs. (F) Enrichment of cis-eQTL, cis-isoQTL, and cis-sQTLs within functional regions of the genome. (G) Loss-of-function mutation intolerance, as measured by pLI score, of eGenes, isoGenes, and sGenes. isoGenes and sGenes exhibit significantly less tolerance to loss-of-function mutations than eGenes (Wilcoxon). (H) Storey’s pi1 statistic of the proportion of true associations in the discovery group of QTL (y axis; permutation q value < 0.05) that are also true associations in the replication group of QTL (x axis; all nominal P values). (I) Number of fine-mapped credible sets versus number of conditionally independent eQTLs discovered. The size of the dots is scaled to the number of genes. (J) Common recurrent inv-eQTLs in the developing brain. Inversions are displayed according to their length and the number of overlapping SNPs. Inversions with significantly associated eGenes have filled circles (FDR-adjusted P < 0.05). The size of the circle indicates the population frequency of the inversions.
Fig. 2. Cross-ancestry gene regulation and fine-mapping.
(A) Genotype principal components analysis of the developing human brain samples. Sample ancestry was inferred by merging imputed genotypes with 1000 Genomes (47). (B) Comparison of eGenes discovered in the full cross-ancestry dataset (“ALL,” N = 629) and in the separate subancestries, EUR (N = 280), AMR (N = 162), and AFR (N = 135). (C) Correlation of eQTL effect sizes between AMR/AFR (top/bottom) and EUR, as measured by allelic fold change. Each dot is an AMR/AFR eGene-primary eQTL pair and is colored by its nominal significance in EUR. Gray lines denote the lower and upper bounds of aFC. (D) Comparison of fine-mapping credible set sizes between the ancestries. (E) Cis associations for the gene MTFR1 in the cross-ancestry, EUR, AMR, and AFR datasets.
Fig. 3. Trimester-specific patterns of gene expression and splicing regulation.
(A) Comparison of eGenes and sGenes identified in Tri1, Tri2, and the full dataset. We identified many more eGenes and sGenes in Tri1 than in Tri2 despite similar sample sizes. (B) Example of a trimester-specific eGene for WARS2, where rs146862216 (G>A) is an eQTL in Tri1 (beta = −0.89, FDR = 3.88 × 10^−13) but not in Tri2 (beta = −0.03, P = 0.71). (C) cis-Heritability of gene expression drops from Tri1 to Tri2 time points, as well as between developing and adult (PsychENCODE) samples (left). A similar drop is observed in splicing heritability between Tri1 and Tri2 (right). (D) Sliding-window analysis of gene expression (left) and splicing (right) cis-heritability for samples from 10 to 18 weeks. Each dot represents a sliding set of temporally ordered samples (N = 150), with mean age (±SD) on the x axis and median cis-h2SNP (±SD) on the y axis. (E) Comparison of gene biotype enrichment of Tri1-only and Tri2-only eGenes and sGenes. Values associated with each gene type represent the proportion of genes classified within that category. Red boxes highlight significant post hoc P value. (F) Estimated proportion of seven major cell classes over development assessed by bulk tissue cell-type deconvolution using CIBERSORTx (53).
Fig. 4. Integrative analysis of xQTLs with neuropsychiatric GWAS results.
(A) Quantile-quantile plot of SCZ GWAS P values, subsetted by top cis-eQTLs, cis-isoQTLs, and cis-sQTLs compared with all background GWAS SNPs. (B) S-LDSC enrichment of SCZ GWAS heritability within developing brain xQTLs and adult brain cortex eQTLs (GTEx v8) compared with background functional annotations. The proportional genomic coverage of SNPs within each annotation is shown in parentheses. (C) Estimated proportion (±SE) of GWAS h2SNP mediated by the cis genetic component of gene, isoform, and intron (splicing) regulation. isoQTLs mediate the greatest degree of heritability for multiple neuropsychiatric traits in the developing brain compared with eQTLs and sQTLs. (D) Estimated proportion of GWAS h2SNP mediated by the cis genetic component of trimester-stratified gene, isoform, and intron (splicing) regulation. For (B), (C), and (D), ***FDR < 0.001, **FDR < 0.01, and *FDR < 0.05.
Fig. 5. Neuropsychiatric risk gene prioritization through colocalization and isoTWASs.
(A) Total number of GWAS loci exhibiting significant colocalization (CLPP > 0.01 or PP4 > 0.7) with specific developing brain xQTL annotations. (B) Colocalization between SCZ GWAS and developing brain xQTLs ranked by CLPP and grouped by GWAS locus, as indicated by the index SNP to the right. Only top colocalization results for protein-coding genes at genome-wide significant loci are shown. (C) Top: locus plots of SCZ GWAS with SP4 eQTLs and sQTLs. A significant colocalization (CLPP = 0.01) is observed for SCZ GWAS with a cryptic splicing event in SP4. SP4 does not have a detectable eQTL in the developing brain. Middle: Gene structure of SP4 with and without cryptic exon inclusion, likely resulting in nonsense-mediated decay. Bottom left: sashimi plot showing the density of exon and junction read mapping for intron cluster chr7:21516925-21521542 stratified by the colocalized sQTL. Bottom right: sQTL rs10276352 (G>A) increases the contribution to cluster of annotated intron chr7:21516925-21521542. The SCZ risk allele increases cryptic exon inclusion. (D) Developing brain isoTWAS associations with SCZ GWASs. Each dot represents an isoform, and adjusted P < 0.05 isoforms are shown in red. Genes of fine-mapped isoforms near a GWAS locus are labeled.
Fig. 6. Systems-level integration of risk variation with developmental gene and isoform coexpression.
(A) Workflow for construction of gene- and isoform-level coexpression networks, followed by cell-type, pathway, and disease gene enrichment analyses. Separate gene coexpression networks were built to capture trimester- and sex-specific effects. (B) Top, hierarchical clustering of modules from gene, isoform, trimester, and sex-stratified networks through biweight midcorrelation of eigengenes. Middle, heatmaps depicting module-level enrichment for neuropsychiatric GWAS signal (−log10Penrich from S-LDSC and MAGMA) and ORs for rare variation and cell-type enrichment (truncated at 10). Triangles indicate FDR-corrected P < 0.1 significance. (C) Annotations for M2, a development-wide disease-associated chromatin regulation module. Center: top module (“hub”) genes with circle size reflecting module membership (kME) and orange shading indicating genes with associated high-confidence neuropsychiatric disorder–associated rare variants. Thin edges represent topologic overlap, and solid edges indicate protein-protein interactions from the STRING database. Surrounding: circular bar plot highlighting module enrichment for cell types (purple), common (red) and rare (orange) variation, GO terms (dark green), and module overlap (light green). (D) Annotations for M59, an ADHD-associated mitochondrial/proteasome isoform module.
Fig. 7. Module-interacting eQTLs and context-specific GWAS colocalization.
(A) Hierarchical clustering of cis-eQTL enrichments among specific developing brain cell types as mapped by CellWalker. Outermost numbers denote results from single-cell-type label analysis. Inner numbers denote results from a broader, multilevel label analysis. (B) Schematic of module interaction ieQTL mapping and validation in cultured neurons and progenitors. (C) Results from module ieQTL mapping. From top to bottom, pi1 statistics depicting ieQTL overlap with eQTLs from cultured neurons or NPCs, molecular feature (gene or isoform), number of ieQTLs, and cell-type enrichment. A total of 62 modules with pi1 > 0.2 in either neurons or NPCs are shown. (D) Colocalization between SCZ GWAS and BRINP2 ieQTL rs17659437 in M93 (CLPP = 0.02). (E) Annotation for M93, a SCZ/BIP enriched deep-layer neuronal, synaptic module. (F) Trajectory of M93 eigengene expression across brain development colored by biological sex.

References

    1. Trubetskoy V et al., Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature 604, 502–508 (2022). doi: 10.1038/s41586-022-04434-5
    2. Grove J et al., Identification of common genetic risk variants for autism spectrum disorder. Nat. Genet. 51, 431–444 (2019). doi: 10.1038/s41588-019-0344-8
    3. Gandal MJ, Leppa V, Won H, Parikshak NN, Geschwind DH, The road to precision psychiatry: Translating genetics into disease mechanisms. Nat. Neurosci. 19, 1397–1407 (2016). doi: 10.1038/nn.4409
    4. Ward LD, Kellis M, Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Science 337, 1675–1678 (2012). doi: 10.1126/science.1225057
    5. Maurano MT et al., Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012). doi: 10.1126/science.1222794
