[Preprint]. 2025 Mar 25:2024.08.21.609075.
doi: 10.1101/2024.08.21.609075.

Evaluating Methods for the Prediction of Cell Type-Specific Enhancers in the Mammalian Cortex

Nelson J Johansen et al. bioRxiv.

Update in

  • Evaluating methods for the prediction of cell-type-specific enhancers in the mammalian cortex.
    Johansen NJ, Kempynck N, Zemke NR, Somasundaram S, De Winter S, Hooper M, Dwivedi D, Lohia R, Wehbe F, Li B, Abaffyová D, Armand EJ, De Man J, Ekşi EC, Hecker N, Hulselmans G, Konstantakos V, Mauduit D, Mich JK, Partel G, Daigle TL, Levi BP, Zhang K, Tanaka Y, Gillis J, Ting JT, Ben-Simon Y, Miller J, Ecker JR, Ren B, Aerts S, Lein ES, Tasic B, Bakken TE. Cell Genom. 2025 Jun 11;5(6):100879. doi: 10.1016/j.xgen.2025.100879. Epub 2025 May 21. PMID: 40403730. Free PMC article.

Abstract

Identifying cell type-specific enhancers in the brain is critical to building genetic tools for investigating the mammalian brain. Computational methods for functional enhancer prediction have been proposed and validated in the fruit fly but not yet in the mammalian brain. We organized the 'Brain Initiative Cell Census Network (BICCN) Challenge: Predicting Functional Cell Type-Specific Enhancers from Cross-Species Multi-Omics' to assess machine learning and feature-based methods designed to nominate enhancer DNA sequences that target cell types in the mouse cortex. Methods were evaluated against in vivo validation data from hundreds of cortical cell type-specific enhancers that had previously been packaged into individual AAV vectors and retro-orbitally injected into mice. We find that open chromatin was a key predictor of functional enhancers, and that sequence models improved the identification of non-functional enhancers, which can be deprioritized rather than pursued for in vivo testing. Sequence models also identified cell type-specific transcription factor codes that can guide the in silico design of enhancers. This community challenge establishes a benchmark for enhancer prioritization algorithms and reveals the computational approaches and molecular information that are crucial for identifying functional enhancers in mammalian cortical cell types. The results of this challenge bring us closer to understanding the complex gene regulatory landscape of the mammalian cortex and to designing more efficient genetic tools to target cortical cell types.

Keywords: AAV; ATAC-seq; DNA methylation; HiC; RNA-Seq; benchmark; cell types; challenge; cortex; enhancer; machine learning; mouse; multiome; primates.

Conflict of interest statement

Declaration of interests: None declared.

Figures

Figure 1. Overview of the enhancer prioritization challenge.
(A) Single nucleus multi-omic data from primary motor cortex of human, macaque, marmoset, and mouse. mya, million years ago. (B) Schematic of the computational challenge to prioritize candidate cell type-specific enhancers. (C) Overview of AAV construction, cell type ATAC-seq specificity, and screening of in vivo activity in the mouse brain for three candidate L5 ET enhancers. (D) Teams predicted and ranked 10,000 candidate enhancers for each of 19 cortical cell types and were scored based on prioritization of strong, On-Target enhancers. (E) Combinations of data and methods for top team submissions. (F) Normalized benchmark metrics (Methods) based on epifluorescence and SSv4 from in vivo screening.
Figure 2. Comparison of team enhancer rankings.
(A) Average proportion of ranked enhancers that overlap between pairs of team submissions for all cell types. (B) Upset plot showing the number of validated enhancers that were identified by sets of submissions. (C,D) Rates of identification of (C) On-Target and (D) Mixed-Target, Off-Target and No-Labeling enhancers. (E) Comparison of methods based on distributions of normalized enrichment scores (NES). For each method and cell type specific ranking, NES measures the area under the recovery curve (AUC) up to the 1,000th element compared to a random ranking. (F) Heatmap ordered by Aerts scATACtriplet scoring of L5 ET enhancers and summary of validation results. Examples of a strong (AiE0456m) and weak (AiE0460m) enhancer with Pou3f1 motifs identified by the CREsted model in the highlighted region. AiE0456m was also validated with SSv4.
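
The NES in panel E can be illustrated with a short sketch. The caption does not state the exact normalization, so the code below assumes one common convention: the recovery-curve AUC over the top 1,000 ranks is compared with the AUCs of shuffled rankings and expressed as a z-score. The function names, the number of shuffles, and the z-score form are illustrative assumptions, not the challenge's implementation.

```python
import random
from statistics import mean, pstdev


def recovery_auc(ranking, positives, top_k=1000):
    """Area under the recovery curve over the first top_k ranks.

    The recovery curve counts how many validated enhancers have been
    seen at or before each rank; the area is the sum of those counts.
    """
    hits = 0
    area = 0
    for item in ranking[:top_k]:
        if item in positives:
            hits += 1
        area += hits
    return area


def normalized_enrichment_score(ranking, positives, top_k=1000,
                                n_random=200, seed=0):
    """Z-score of the observed AUC against shuffled rankings.

    ASSUMPTION: comparison "to a random ranking" is modeled here as
    (observed - mean(null)) / sd(null) over n_random shuffles.
    """
    rng = random.Random(seed)
    observed = recovery_auc(ranking, positives, top_k)
    shuffled = list(ranking)
    null = []
    for _ in range(n_random):
        rng.shuffle(shuffled)
        null.append(recovery_auc(shuffled, positives, top_k))
    spread = pstdev(null)
    if spread == 0:
        return 0.0  # no variation in the null; enrichment is undefined
    return (observed - mean(null)) / spread
```

Under this convention, a ranking that places validated enhancers near the top scores well above zero, and a ranking that places them near the bottom scores below zero.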
Figure 3. Enhancer features predictive of functional activity.
(A) Comparison of molecular features between On-Target and other enhancer categories. ** P < 0.001, *** P < 0.0001, two-sided, unpaired Wilcoxon rank-sum test. (B) Correlation of H3K27ac and ATAC-seq specificity for astrocyte enhancers. (C) Examples of astrocyte enhancers with in vivo activity that is better predicted by H3K27ac than by ATAC-seq signal. (D) Summary of informative features from a Random Forest model predicting enhancer activity. ANOVA with Tukey post hoc tests, Bonferroni-corrected P-values. (E) Schematic of ATAC-seq peak quantification based on cut sites or coverage. Box plot comparison of peak specificity for all On-Target enhancers between different preprocessing methods. Adjusted P-values were obtained through t-tests, * P < 0.05, ** P < 0.01. (F) Overall enhancer activity prediction performance from peak specificity for the different methods for On-Target (left) and all except Off-Target (right) enhancers.
Figure 4. Refinement of models and enhancer screening results.
(A) tSNE plots of enhancers based on ATAC-seq specificity and labeled by the targeted cell type. On-Target and No-Labeling enhancers had explainable or unexplainable cell type labeling patterns based on ATAC-seq and DNA sequence (CREsted) model predictions. (B) River plots of enhancer activity, predictions and rescoring of experimental validation data. (C,D) Model scores, predicted TF motifs and SYFP fluorescence for two Oligo enhancers with epifluorescence strengths (C) strong On-Target and (D) No-Labeling rescored to weak On-Target activity. (E) Performance of enhancer ranking methods using the rescored enhancer activities. AP, average precision. scATAC included two normalizations: count-normalized coverage pseudobulk or peak-scaled. (F) CREsted model scores for strong and weak On-Target enhancers grouped by cell type. Mean ± SEM. (G) Comparison of models at identifying No-Labeling enhancers. * P < 0.05, *** P < 0.001, Wilcoxon rank-sum test, Bonferroni-corrected P-values.
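
The average precision (AP) in panel E can be sketched using its standard definition: the mean of the precision values at each rank where a validated enhancer appears. This is a minimal illustration, assuming a ranked list of enhancer IDs and a set of IDs scored as On-Target. The function name and the placeholder IDs are hypothetical, and the challenge's own scoring code may differ.

```python
def average_precision(ranking, positives):
    """Mean precision at the ranks where a positive item is found.

    ranking   : enhancer IDs ordered from highest to lowest score
    positives : set of IDs validated as On-Target (assumed input)
    Positives that never appear in the ranking contribute zero.
    """
    if not positives:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in positives:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(positives)


# Placeholder IDs, not real enhancers: positives sit at ranks 1 and 3,
# so AP = (1/1 + 2/3) / 2.
example_ap = average_precision(
    ["enh_1", "enh_2", "enh_3", "enh_4"], {"enh_1", "enh_3"}
)
```

AP rewards methods that place validated enhancers early in the list, which suits a setting where only the top few candidates go on to in vivo testing.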
