This is a preprint.
scooby: Modeling multi-modal genomic profiles from DNA sequence at single-cell resolution
- PMID: 39345504
- PMCID: PMC11429888
- DOI: 10.1101/2024.09.19.613754
Abstract
Understanding how regulatory DNA elements shape gene expression across individual cells is a fundamental challenge in genomics. Joint RNA-seq and epigenomic profiling provides opportunities to build unifying models of gene regulation that capture sequence determinants across the steps of gene expression. However, current models, developed primarily for bulk omics data, fail to capture the cellular heterogeneity and dynamic processes revealed by single-cell multi-modal technologies. Here, we introduce scooby, the first framework to model scRNA-seq coverage and scATAC-seq insertion profiles along the genome from sequence at single-cell resolution. To this end, we leverage the pre-trained multi-omics profile predictor Borzoi as a foundation model, equip it with a cell-specific decoder, and fine-tune its sequence embeddings. Specifically, we condition the decoder on the cell's position in a precomputed single-cell embedding, which results in strong generalization capability. Applied to a hematopoiesis dataset, scooby recapitulates cell-specific expression levels of held-out genes and identifies regulators and their putative target genes through in silico motif deletion. Moreover, accurate variant effect prediction with scooby allows bulk eQTL effects to be broken down into single-cell effects and their impact on chromatin accessibility and gene expression to be delineated. We anticipate that scooby will help unravel the complexities of gene regulation at the resolution of individual cells.
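The idea of conditioning a decoder on a cell's position in a precomputed embedding can be illustrated with a toy sketch. This is not the scooby implementation: the function names, the single linear map from cell embedding to decoder weights (`hyper_w`, `hyper_b`), the softplus output, and all numbers are assumptions chosen for illustration only. The sketch shows how the same sequence embeddings can yield different predicted profiles for different cells.

```python
import math


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softplus(x):
    """Smooth non-negative activation, written to stay stable for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cell_decoder_weights(cell_embedding, hyper_w, hyper_b):
    """Hypothetical mapping from a cell's embedding to its decoder weights."""
    return [w + b for w, b in zip(matvec(hyper_w, cell_embedding), hyper_b)]


def predict_profile(seq_embeddings, cell_embedding, hyper_w, hyper_b):
    """Predict one non-negative value per genomic bin for a single cell.

    seq_embeddings: one feature vector per bin (shared across all cells).
    cell_embedding: the cell's coordinates in a precomputed embedding.
    """
    weights = cell_decoder_weights(cell_embedding, hyper_w, hyper_b)
    return [
        softplus(sum(f * w for f, w in zip(bin_features, weights)))
        for bin_features in seq_embeddings
    ]


# Toy inputs (all values are made up): 3 bins with 2 sequence features each.
seq_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hyper_w = [[1.0, 0.0], [0.0, 1.0]]  # identity map, for readability
hyper_b = [0.0, 0.0]

cell_a = [2.0, -2.0]  # two cells at different embedding positions
cell_b = [-2.0, 2.0]

profile_a = predict_profile(seq_embeddings, cell_a, hyper_w, hyper_b)
profile_b = predict_profile(seq_embeddings, cell_b, hyper_w, hyper_b)
```

In this toy setup the sequence embeddings are computed once and reused, and only the small cell-dependent weight vector changes between cells, which is one plausible reason such a design could scale to many cells.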