MMGraph: a multiple motif predictor based on graph neural network and coexisting probability for ATAC-seq data
- PMID: 35997564
- PMCID: PMC9524997
- DOI: 10.1093/bioinformatics/btac572
Abstract
Motivation: Transcription factor binding site (TFBS) prediction is a crucial step in revealing the functions of transcription factors from high-throughput sequencing data. Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) provides insight into TFBSs and nucleosome positioning by probing open chromatin and, unlike traditional technologies, can reveal multiple TFBSs simultaneously. Existing tools based on convolutional neural networks (CNNs) find only fixed-length TFBSs from ATAC-seq data. Graph neural networks (GNNs) can be considered an extension of CNNs and have great potential for finding multiple TFBSs of different lengths from ATAC-seq data.
Results: We develop a motif predictor called MMGraph, based on a three-layer GNN and the coexisting probability of k-mers, for finding multiple motifs from ATAC-seq data. Experiments on 88 ATAC-seq datasets indicate that MMGraph achieves the best performance, with an area of eight metrics radar score of 2.31, and finds 207 higher-quality multiple motifs than other existing tools.
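The abstract does not define the coexisting probability of k-mers. As a rough illustration only, one plausible reading is the fraction of sequences in which two k-mers both occur. The sketch below computes that quantity in plain Python; the function names `kmers` and `cooccurrence_probability` and the toy sequences are hypothetical and are not taken from the MMGraph package.

```python
from collections import Counter
from itertools import combinations


def kmers(seq, k):
    """Return the set of distinct k-mers found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cooccurrence_probability(seqs, k):
    """Map each unordered k-mer pair to the fraction of sequences containing both.

    Assumption: this is one simple interpretation of "coexisting probability",
    not the definition used by MMGraph.
    """
    pair_counts = Counter()
    for seq in seqs:
        # Sort so each pair is always stored as (smaller, larger).
        present = sorted(kmers(seq.upper(), k))
        pair_counts.update(combinations(present, 2))
    n = len(seqs)
    return {pair: count / n for pair, count in pair_counts.items()}


# Toy sequences, invented for illustration.
seqs = ["ACGTAC", "ACGTTT", "TTTACG"]
probs = cooccurrence_probability(seqs, k=3)
```

Pairs with a high value could then serve as weighted edges between k-mer nodes in a graph, which is the kind of input a GNN operates on; how MMGraph actually builds its graph is not stated in the abstract.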
Availability and implementation: MMGraph is wrapped in a Python package, which is available at https://github.com/zhangsq06/MMGraph.git.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2022. Published by Oxford University Press.