Patterns of transcription factor binding and epigenome at promoters allow interpretable predictability of multiple functions of non-coding and coding genes
- PMID: 37520281
- PMCID: PMC10371796
- DOI: 10.1016/j.csbj.2023.07.014
Abstract
Understanding the biological roles of all genes through experimental methods alone is challenging. A computational approach with reliable interpretability is needed to infer gene function, particularly for non-coding RNAs. We analyzed genomic features that are present across both coding and non-coding genes, such as transcription factor (TF) and cofactor ChIP-seq (n = 823), histone modification ChIP-seq (n = 621), cap analysis of gene expression (CAGE) tags (n = 255), and DNase hypersensitivity profiles (n = 255), to predict ontology-based functions of genes. Our approach to gene function prediction was reliable (>90% balanced accuracy) for 486 gene-sets. PubMed abstract mining and CRISPR screens supported the inferred associations of genes with the biological functions for which our method had high accuracy. Further analysis revealed that TF-binding patterns at promoters have high predictive strength for multiple functions. These patterns add an unexplored dimension of explainable regulatory aspects of genes and their functions. We therefore performed a comprehensive analysis of the functional specificity of TF-binding patterns at promoters and used them to cluster functions, revealing many latent groups of gene-sets involved in common major cellular processes. We also showed how our approach can be used to infer the functions of non-coding genes from CRISPR screens of coding genes, and validated these inferences using a long non-coding RNA CRISPR screen. Thus, our results demonstrate the generality of our approach by using gene-sets from CRISPR screens. Overall, our approach opens an avenue for predicting the involvement of non-coding genes in various functions.
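For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how balanced accuracy is commonly computed for a binary task (gene in a gene-set vs. not). It is not the authors' code; the function name and the toy labels are hypothetical and serve only to illustrate the definition (mean of sensitivity and specificity).

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumption for illustration: 1 = gene belongs to the gene-set, 0 = it does not.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on genes inside the set
    specificity = tn / (tn + fp)  # recall on genes outside the set
    return (sensitivity + specificity) / 2


# Hypothetical toy labels: 2 genes in the set, 8 outside it.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

Because gene-sets are usually much smaller than the background of all genes, plain accuracy can look high even when most set members are missed; balanced accuracy weights both classes equally, which is presumably why it is the figure reported here.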
Keywords: Coregulation of functions; Epigenetics; Functional genomics; Gene function prediction; Gene regulation; General transcription factor (GTF); Long noncoding RNA (long ncRNA, lncRNA).
© 2023 The Authors.
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes. PLoS One. 2014 Oct 2;9(10):e109443. doi: 10.1371/journal.pone.0109443. eCollection 2014. PMID: 25275320. Free PMC article.
- Comprehensive Identification of Long Non-coding RNAs in Purified Cell Types from the Brain Reveals Functional LncRNA in OPC Fate Determination. PLoS Genet. 2015 Dec 18;11(12):e1005669. doi: 10.1371/journal.pgen.1005669. eCollection 2015 Dec. PMID: 26683846. Free PMC article.
- Transcription factor binding profiles reveal cyclic expression of human protein-coding genes and non-coding RNAs. PLoS Comput Biol. 2013;9(7):e1003132. doi: 10.1371/journal.pcbi.1003132. Epub 2013 Jul 11. PMID: 23874175. Free PMC article.
- Databases and prospects of dynamic gene regulation in eukaryotes: A mini review. Comput Struct Biotechnol J. 2023 Mar 22;21:2147-2159. doi: 10.1016/j.csbj.2023.03.032. eCollection 2023. PMID: 37013004. Free PMC article. Review.
- Gene regulation of mammalian long non-coding RNA. Mol Genet Genomics. 2018 Feb;293(1):1-15. doi: 10.1007/s00438-017-1370-9. Epub 2017 Sep 11. PMID: 28894972. Review.