A community effort to optimize sequence-based deep learning models of gene regulation
- PMID: 39394483
- PMCID: PMC12339383
- DOI: 10.1038/s41587-024-02414-w
Abstract
A systematic evaluation of how model architectures and training strategies impact genomics model performance is needed. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. All top-performing models used neural networks but diverged in architectures and training strategies. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide models into modular building blocks. We tested all possible combinations for the top three models, further improving their performance. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets, demonstrating the progress that can be driven by gold-standard genomics datasets.
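The idea of dividing models into modular building blocks and testing every combination can be illustrated with a minimal sketch. The slot names and model labels below are hypothetical, chosen only for illustration; the abstract does not list the actual building blocks or say how many there are. The sketch only enumerates candidate configurations and does not train anything.

```python
from itertools import product

# Hypothetical slot names and model labels (assumptions for illustration;
# the abstract does not name the actual building blocks).
SLOTS = ("data_processor_and_trainer", "first_layers", "core_layers", "final_layers")
MODELS = ("model_A", "model_B", "model_C")


def enumerate_combinations(slots=SLOTS, models=MODELS):
    """Return every way of picking one source model for each slot."""
    return [dict(zip(slots, choice)) for choice in product(models, repeat=len(slots))]


combos = enumerate_combinations()
print(len(combos))  # 3 models ** 4 slots = 81
print(combos[0])
```

Under these assumptions, each dictionary describes one candidate model built from blocks drawn from different source models. In practice some combinations may be incompatible and would need to be filtered out or adapted before training.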
© 2024. The Author(s).
Conflict of interest statement
Competing interests: E.D.V. is the founder of Sequome, Inc. A.R. is an employee of Genentech and has equity in Roche. A.R. is a cofounder and equity holder of Celsius Therapeutics, an equity holder in Immunitas and, until July 31, 2020, was a scientific advisory board member of Thermo Fisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov. A.R. was an Investigator of the Howard Hughes Medical Institute when this work was initiated. The remaining authors declare no competing interests.
Update of
- Evaluation and optimization of sequence-based gene regulatory deep learning models. bioRxiv [Preprint]. 2024 Feb 17:2023.04.26.538471. doi: 10.1101/2023.04.26.538471. Update in: Nat Biotechnol. 2025 Aug;43(8):1373-1383. doi: 10.1038/s41587-024-02414-w. PMID: 38405704.
