Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Meta-Analysis
. 2021 Oct 19;13(1):162.
doi: 10.1186/s13073-021-00980-1.

Comprehensive multi-omics integration identifies differentially active enhancers during human brain development with clinical relevance

Affiliations
Meta-Analysis

Comprehensive multi-omics integration identifies differentially active enhancers during human brain development with clinical relevance

Soheil Yousefi et al. Genome Med. .

Abstract

Background: Non-coding regulatory elements (NCREs), such as enhancers, play a crucial role in gene regulation, and genetic aberrations in NCREs can lead to human disease, including brain disorders. The human brain is a complex organ that is susceptible to numerous disorders; many of these are caused by genetic changes, but a multitude remain currently unexplained. Understanding NCREs acting during brain development has the potential to shed light on previously unrecognized genetic causes of human brain disease. Despite immense community-wide efforts to understand the role of the non-coding genome and NCREs, annotating functional NCREs remains challenging.

Methods: Here we performed an integrative computational analysis of virtually all currently available epigenome data sets related to human fetal brain.

Results: Our in-depth analysis unravels 39,709 differentially active enhancers (DAEs) that show dynamic epigenomic rearrangement during early stages of human brain development, indicating likely biological function. Many of these DAEs are linked to clinically relevant genes, and functional validation of selected DAEs in cell models and zebrafish confirms their role in gene regulation. Compared to enhancers without dynamic epigenomic rearrangement, DAEs are subjected to higher sequence constraints in humans, have distinct sequence characteristics and are bound by a distinct transcription factor landscape. DAEs are enriched for GWAS loci for brain-related traits and for genetic variation found in individuals with neurodevelopmental disorders, including autism.

Conclusion: This compendium of high-confidence enhancers will assist in deciphering the mechanism behind developmental genetics of human brain and will be relevant to uncover missing heritability in human genetic brain disorders.

Keywords: Clinical genetics; Computational analysis; Data integration; Enhancer; Epigenome; Gene regulatory elements; Human brain development; Mendelian disorders; Meta-analysis; Non-coding genome.

PubMed Disclaimer

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Fig. 1
Integrative analysis of brain enhancers during fetal development. A Various steps taken in the integrative analysis of this study. See text for details. B Functional enrichment analysis using GREAT [17], for DAEs (upper panel, n = 39,709) and nDAEs (lower panel, n = 162,454), determined using whole genome as a background. X-axis reports the − Log10 p value as determined by GREAT. C Venn diagram showing the overlap between DAEs (upper panel) and nDAEs (lower panel) interacting with protein-coding and lincRNA genes in CP (left) and GZ (right). D Venn diagram showing the overlap between interactions of protein-coding and lincRNA genes with nDAEs (left) and DAEs (right), for protein-coding and lincRNA genes in CP (upper panel) and GZ (lower panel). E Box plots showing gene expression levels as determined by RNA-seq, for genes that interact by HiC with DAEs (light gray) or nDAEs (dark gray) in CP (left) and GZ (right), for fetal (red) or adult (blue) brain samples. Boxes are interquartile range (IQR); line is median; and whiskers extend to 1.5 the IQR. PCW, postconceptional week. FPKM, fragments per kilobase of transcript per million mapped reads. * p < 0.05; ** p < 0.01; *** p < 0.001; ns, not significant (wilcox.test). Data obtained from: 12 PCW, Yan et al [18]; 15-17 PCW, De la Torre-Ubieta et al [19]; 17 PCW, Roadmap [10]; 81 years, Roadmap [10]; mean of fetal sources is the mean expression of the first three fetal samples. F Box plots showing RNA-seq gene expression for genes interacting with 1, 2, 3, 4, or 5 or more DAEs in CP (left) and GZ (right). Left y-axis shows gene expression (log2 FPKM), right y-axis, and line plot shows the number of genes per DAE group. * p < 0.05; ***p < 0.001; ****p < 0.0001 (wilcox.test). RNA-seq data from Allen human brain atlas [20]. G Bar plot showing the percentage of GFP+ cells in NSCs (blue) and HEK cells (red), from cell transfection experiments with an enhancer reporter plasmid for 22 tested enhancers and an empty plasmid control. Plotted is the percentage of GFP+ in cells co-transfected with an mCherry expressing plasmid, to correct for transfection efficiency. Bars show the average from two independent experiments, with each enhancer tested each in duplicate. Error bars represent standard deviation. * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001 (one-way ANOVA test followed by multiple comparison test (Fisher’s LSD test)
Fig. 2
Fig. 2
Distinct sequence characteristics between DAEs and nDAEs. A Line graph showing the number of protein-coding and lincRNA genes (1, 2, 3, 4, or 5 or more) that each DAE is interacting with, and the number of DAEs per category, for CP (red) and GZ (blue). B As A, but here for nDAEs. C Box plots showing the median ncER percentile (left) [38], GC content score (middle) [39] or phastcons score (right) [39] for DAEs-CP (red) and DAEs-GZ (blue) that interact with 1, 2, 3, 4, or 5 or more protein-coding and lincRNA genes. Boxes are IQR; line is median; and whiskers extend to 1.5 the IQR. * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001; ns, not significant (wilcox.test). D As C, but here for nDAEs. E Box plots, showing from left to right ncER percentile [38], GC content score [39], phastcons score [39], Orion score [40], and CADD score [41], for all DAEs (light gray) and nDAEs (dark gray), or for those DAEs and nDAEs that are interacting in CP or GZ with protein-coding and lincRNA genes (red) or genes with a known OMIM phenotype (blue). Boxes are IQR; line is median; and whiskers extend to 1.5 the IQR. * p < 0.05; ** p < 0.01; *** p < 0.001; ns, not significant (wilcox.test). F Box plot showing the pLI score [42] of genes interacting with DAEs (light gray) and nDAEs (dark gray) in CP or GZ. Boxes are IQR; line is median; and whiskers extend to 1.5 the IQR. *** p < 0.001; (wilcox.test). G Kernel density plot showing the distribution of loss-of-function tolerance scores for non-coding sequences [43] for all DAEs (light gray), all nDAEs (dark gray), DAEs linked to protein-coding and lincRNA genes in CP (red), DAEs linked to protein-coding and lincRNA genes in GZ (green), nDAEs linked to protein-coding and lincRNA genes in CP (orange), and nDAEs linked to protein-coding and lincRNA genes in GZ (yellow). H Bar chart showing the most enriched transposable elements (TEs) overlapping with from left to right all nDAEs, all DAEs, DAEs interacting with protein-coding and lincRNA genes in CP, and DAEs interacting with protein-coding genes in GZ. Plotted is a ratio between the observed (O) number of TEs over the expected (E). Different classes of TE are indicated with different colors as indicated
Fig. 3
Fig. 3
DAEs and nDAEs are enriched for distinct transcription factor binding sites. A Line plot showing the distribution of the mean ncER percentile (left) [38], GC content score (middle) [39], and phastcons score (right) [39] over the relative bin position for all DAEs. B Line plot showing the log2 enrichment for various epigenome features as indicated, over the relative bin positions for all DAEs. Different colors indicate different time points of human brain development and different brain regions from which the data were obtained. DFC, dorsal frontal cortex; CBC, cerebellar cortex; OC, occipital cortex; FC, frontal cortex; CP, cortical plate; GZ, germinal zone; Brain, whole brain. Epigenome data used are summarized in Additional file 3: Table S2. C Bar chart showing the relative LOLA enrichment of TFs from JASPAR in all DAEs (light gray), in the central part of all DAEs (ncER subset, orange), in DAEs linked to genes in CP (red) and in DAEs linked to genes in GZ (blue). X-axis displays the rank score (a combination of p value, odds ratio from Fisher’s exact test, and the raw number of overlapping regions) from LOLA. The rank was re-scaled between 0 and 100, so that DAEs with a larger TFs enrichment have a higher rank. Also shown is a heatmap showing the RNA-seq expression levels (Log2 FPKM) of the same TFs across various human fetal tissues. RNA-seq data obtained from ENCODE project [7]. D As in A, but here for nDAEs. E As in B, but here for nDAEs. Note the difference in y-axis scale for H3K4me3 and H3K27me3 compared to panel B given the higher enrichment in nDAEs. F As in C, but now for nDAEs. G Line plot showing the distribution of enrichment (− log10 p value as determined by HOMER analysis) across the relative DAE bins, for the 251 TF motifs that were not equally enriched in all 20 bin groups. The most enriched TF motifs are indicated. H As G, but now for 218 TFs that were not equally enriched across the 20 bin groups of all nDAEs
Fig. 4
Fig. 4
Clustering of DAEs unravels temporal dynamics of brain gene regulation. A Heatmap displaying all available epigenome features for PCW 8-12, across all DAEs interacting with protein-coding genes in CP (upper heatmap) and GZ (lower heatmap) (AI). K-means clustering analysis of epigenome features (AII) identifies two clusters, cluster 1 (red) and cluster 2 (green). Level of enrichment is indicated on the y-axis in Log2 TPM. Box plots (AIII) shows RNA-seq gene expression of protein-coding genes regulated by the DAEs from each cluster (Expression pattern), for available data from PCW 8, 9, and 12 [20]. Boxes are IQR; line is median; and whiskers extend to 1.5 the IQR. Gene enrichment analysis for the corresponding genes in each cluster (AIV). X-axis shows the − log 10 (p value) from Enrichr. B As for A, but now for PCW 13–18. C As for A, but now for PCW > 18
Fig. 5
Fig. 5
Variants in DAEs and nDAEs are associated with human disease. A Bar graph showing the number of DAEs linked to their target genes in CP and GZ and their most enriched OMIM phenotypes. B Plot showing the top-25 GWAS phenotypes that are enriched in DAEs compared to nDAEs (log2 odds ratio DAE/nDAE). C Line graph showing the odds ratio, confidence interval, and p value for enrichment of CNVs from an ASD cohort at DAEs and nDAEs. CNVs data obtained from Brandler et al [44]. * p < 0.05; ** p < 0.01 (Fisher’s exact test). D Genome browser track showing the regulatory landscape of the GABRD gene. Indicated are a DAE (chr1: 1,840,449-1,840,835) that is interacting with the GABRD promoter, and a deletion (chr1: 1,840,001-1,845,000) that is found in an epilepsy patient (CNET0068) from Monlong et al. [45].* p < 0.05 (Fisher’s exact test). E Line graph showing the odds ratio, confidence interval, and p value for enrichment of SNV from an ASD cohort at DAEs and nDAEs. SNV data obtained from Zhou et al. [46]
Fig. 6
Fig. 6
CRISPRi and zebrafish experiments validate activity of DAEs regulating genes involved in neurogenetic disorders. A Genome browser tracks showing enhancers interacting with CHD2 (left), CAD (middle), and TRAK1 (right). Shown are RNA-seq expression profiles, various histone modifications, and ATAC-seq and DNase profiles for various time points during human fetal brain development, as indicated. The tested DAEs are indicated by the box. B Representative fluorescent images showing GFP expression of transgenic enhancer reporter assays in zebrafish larvae at 1, 2, and 3 dpf. Tested are the enhancers for CHD2, CAD, and TRAK1 (shown in A), and two additional enhancers for MACF1 and TUBB2A. The five tested enhancers induced GFP expression in the head of the larvae, amongst others in the forebrain in 61.1%, 81.8%, and 87.9% larvae for CHD2; 88.9%, 85.4%, and 85.7% for CAD; 87.1%, 70%, and 88.5% for TRAK1; 81.5%, 85.7%, and 76.2% for MACF1; and 87.5%, 100%, and 100% for TUBB2A, respectively at 1, 2, and 3 dpf. Also peripheral neuron-specific GFP expression was found, with 0%, 60.6%, and 21.2% for CHD2; 68.9%, 24.4%, and 51.4% for CAD; 83.6%, 65.5%, and 67.3% for TRAK1; 37%, 50%, and 33.3% for MACF1; and 50%, 83.3%, and 63.3% for TUBB2A, respectively at 1, 2, and 3 dpf. See also Additional file 14: Table S13. Scale bars represent 500 μm. C Bright-field image of a wild type zebrafish larvae at 3 dpf (lateral view), with the anatomical sites that were scored for GFP expression indicated. D qRT-PCR showing reduction of CHD2, CAD, and TRAK1 expression in NSCs upon silencing of respective enhancer by dCas9-KRAB-MECP2. Data represent fold change of expression of respective genes compared to mock transfected cells (KRAB-MECP2 plasmid only, no gRNA plasmid). Two independent transfection experiments were performed, each in duplicate. All data points and standard deviation are shown. ** p < 0.01; **** p < 0.0001 (one-way ANOVA test followed by multiple comparison test (Fisher’s LSD test). E qRT-PCR showing reduction of REST expression in NSCs upon silencing of CHD2, CAD, or TRAK1 enhancers by dCas9-KRAB-MECP2. Data represent fold change of REST expression compared to mock transfected cells (KRAB-MECP2 plasmid only, no gRNA plasmid). Two independent transfection experiments were performed, each in duplicate. All data points and standard deviation are shown. ** p < 0.01 (one-way ANOVA test followed by multiple comparison test (Fisher’s LSD test)

References

    1. Spitz F, Furlong EE. Transcription factors: from enhancer binding to developmental control. Nat Rev Genet. 2012;13:613–626. doi: 10.1038/nrg3207. - DOI - PubMed
    1. Nord AS, West AE. Neurobiological functions of transcriptional enhancers. Nat Neurosci. 2020;23:5–14. doi: 10.1038/s41593-019-0538-5. - DOI - PMC - PubMed
    1. D'Haene E, Vergult S. Interpreting the impact of noncoding structural variation in neurodevelopmental disorders. Genet Med. 2021;23:34–46. doi: 10.1038/s41436-020-00974-1. - DOI - PMC - PubMed
    1. Perenthaler E, Yousefi S, Niggl E, Barakat TS. Beyond the exome: the non-coding genome and enhancers in neurodevelopmental disorders and malformations of cortical development. Front Cell Neurosci. 2019;13:352. doi: 10.3389/fncel.2019.00352. - DOI - PMC - PubMed
    1. Carullo NVN, Day JJ. Genomic enhancers in brain health and disease. Genes (Basel). 2019;10(1):43. - PMC - PubMed

Publication types

Substances