maxATAC: Genome-scale transcription-factor binding prediction from ATAC-seq with deep neural networks
- PMID: 36719906
- PMCID: PMC9917285
- DOI: 10.1371/journal.pcbi.1010863
Abstract
Transcription factors (TFs) read the genome, fundamentally connecting DNA sequence to gene expression across diverse cell types. Determining how, where, and when TFs bind chromatin will advance our understanding of gene regulatory networks and cellular behavior. The 2017 ENCODE-DREAM in vivo Transcription-Factor Binding Site (TFBS) Prediction Challenge highlighted the value of chromatin accessibility data to TFBS prediction, establishing state-of-the-art methods for TFBS prediction from DNase-seq. However, the more recent Assay-for-Transposase-Accessible-Chromatin (ATAC)-seq has surpassed DNase-seq as the most widely used chromatin accessibility profiling method. Furthermore, ATAC-seq is the only such technique available at single-cell resolution from standard commercial platforms. While ATAC-seq datasets grow exponentially, suboptimal motif scanning is unfortunately the most common method for TFBS prediction from ATAC-seq. To enable community access to state-of-the-art TFBS prediction from ATAC-seq, we (1) curated an extensive benchmark dataset (127 TFs) for ATAC-seq model training and (2) built "maxATAC", a suite of user-friendly, deep neural network models for genome-wide TFBS prediction from ATAC-seq in any cell type. With models available for 127 human TFs, maxATAC is the largest collection of high-performance TFBS prediction models for ATAC-seq. maxATAC performance extends to primary cells and single-cell ATAC-seq, enabling improved TFBS prediction in vivo. We demonstrate maxATAC's capabilities by identifying TFBS associated with allele-dependent chromatin accessibility at atopic dermatitis genetic risk loci.
Copyright: © 2023 Cazares et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
I have read the journal’s policy and the authors of this manuscript have the following competing interests: AB is a co-founder of Datirium, LLC.
Similar articles
- Profiling of chromatin accessibility identifies transcription factor binding sites across the genome of Aspergillus species. BMC Biol. 2021 Sep 6;19(1):189. doi: 10.1186/s12915-021-01114-0. BMC Biol. 2021. PMID: 34488759 Free PMC article.
- XL-DNase-seq: improved footprinting of dynamic transcription factors. Epigenetics Chromatin. 2019 Jun 4;12(1):30. doi: 10.1186/s13072-019-0277-6. Epigenetics Chromatin. 2019. PMID: 31164146 Free PMC article.
- ATAC-STARR-seq reveals transcription factor-bound activators and silencers within chromatin-accessible regions of the human genome. Genome Res. 2022 Aug 25;32(8):1529-1541. doi: 10.1101/gr.276766.122. Genome Res. 2022. PMID: 35858748 Free PMC article.
- [Advances in assay for transposase-accessible chromatin with high-throughput sequencing]. Yi Chuan. 2020 Apr 20;42(4):333-346. doi: 10.16288/j.yczz.19-279. Yi Chuan. 2020. PMID: 32312702 Review. Chinese.
- Chromatin accessibility profiling by ATAC-seq. Nat Protoc. 2022 Jun;17(6):1518-1552. doi: 10.1038/s41596-022-00692-9. Epub 2022 Apr 27. Nat Protoc. 2022. PMID: 35478247 Free PMC article. Review.
Cited by
- Machine-guided design of synthetic cell type-specific cis-regulatory elements. bioRxiv [Preprint]. 2023 Aug 9:2023.08.08.552077. doi: 10.1101/2023.08.08.552077. bioRxiv. 2023. Update in: Nature. 2024 Oct;634(8036):1211-1220. doi: 10.1038/s41586-024-08070-z. PMID: 37609287 Free PMC article. Updated. Preprint.
- Comprehensive transcription factor perturbations recapitulate fibroblast transcriptional states. Nat Genet. 2025 Aug 6. doi: 10.1038/s41588-025-02284-1. Online ahead of print. Nat Genet. 2025. PMID: 40770575
- CREaTor: zero-shot cis-regulatory pattern modeling with attention mechanisms. Genome Biol. 2023 Nov 23;24(1):266. doi: 10.1186/s13059-023-03103-8. Genome Biol. 2023. PMID: 37996959 Free PMC article.
- Predmoter-cross-species prediction of plant promoter and enhancer regions. Bioinform Adv. 2024 May 24;4(1):vbae074. doi: 10.1093/bioadv/vbae074. eCollection 2024. Bioinform Adv. 2024. PMID: 38841126 Free PMC article.
- Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays. 2024 Jul;46(7):e2300210. doi: 10.1002/bies.202300210. Epub 2024 May 8. Bioessays. 2024. PMID: 38715516 Free PMC article. Review.
Grants and funding
- U01 AI130830/AI/NIAID NIH HHS/United States
- R01 NS099068/NS/NINDS NIH HHS/United States
- R01 AI153442/AI/NIAID NIH HHS/United States
- R01 DK107502/DK/NIDDK NIH HHS/United States
- U01 AI150748/AI/NIAID NIH HHS/United States
- R21 AI156185/AI/NIAID NIH HHS/United States
- R01 HG010730/HG/NHGRI NIH HHS/United States
- R01 GM055479/GM/NIGMS NIH HHS/United States
- R01 AI024717/AI/NIAID NIH HHS/United States
- P30 AR070549/AR/NIAMS NIH HHS/United States
- R01 AI148276/AI/NIAID NIH HHS/United States
- U19 AI070235/AI/NIAID NIH HHS/United States
- P01 AI150585/AI/NIAID NIH HHS/United States
- R01 AR073228/AR/NIAMS NIH HHS/United States
- U01 HG011172/HG/NHGRI NIH HHS/United States