BMC Bioinformatics. 2025 Mar 10;26(1):81.
doi: 10.1186/s12859-025-06094-4.

SNPeBoT: a tool for predicting transcription factor allele specific binding

Patrick Gohl et al. BMC Bioinformatics.

Abstract

Background: Mutations in non-coding regulatory regions of DNA may lead to disease by disrupting transcription factor binding. However, our understanding of transcription factor binding patterns, and of how changes to binding sites affect transcription factor activity, remains limited. To address this, we trained a deep learning model to predict the effects of single nucleotide polymorphisms (SNPs) on transcription factor binding. Allele-specific binding (ASB) data from chromatin immunoprecipitation sequencing (ChIP-seq) experiments were paired with high-sequence-identity DNA-binding domains assessed in protein binding microarray (PBM) experiments. For each transcription factor, a paired DNA-binding domain was selected, from which we derived E-score profiles for the reference and alternate DNA sequences of ASB events. A convolutional neural network (CNN) was trained to predict whether these profiles indicated ASB gain, ASB loss or no change in binding. In total, 18,211 E-score profiles from 113 transcription factors were split into training, validation and test data. We compared the performance of the trained model with that of other available tools for predicting the effect of SNPs on transcription factor binding. Our model showed higher accuracy and ASB recall than the best-scoring benchmark tools.
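The idea of paired reference and alternate E-score profiles can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a profile is the list of E-scores of every k-mer overlapping the SNP position, that k = 8 (the k-mer length for which PBM E-scores are usually reported), and that E-scores are available as a plain dictionary. The names `kmers_overlapping`, `escore_profile`, `ref_alt_profiles` and the `escores` lookup are hypothetical, and reverse-complement handling is left out.

```python
from typing import Dict, List, Tuple

K = 8  # assumed k-mer length; PBM E-scores are usually reported for 8-mers


def kmers_overlapping(seq: str, pos: int, k: int = K) -> List[str]:
    """Return every k-mer of seq that covers the 0-based index pos."""
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(seq) - k)
    return [seq[s:s + k] for s in range(first_start, last_start + 1)]


def escore_profile(seq: str, pos: int, escores: Dict[str, float],
                   k: int = K, missing: float = 0.0) -> List[float]:
    """Look up one E-score per overlapping k-mer (hypothetical lookup table)."""
    return [escores.get(kmer, missing) for kmer in kmers_overlapping(seq, pos, k)]


def ref_alt_profiles(ref_seq: str, pos: int, alt_base: str,
                     escores: Dict[str, float],
                     k: int = K) -> Tuple[List[float], List[float]]:
    """Build the reference profile and the profile after substituting alt_base."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return (escore_profile(ref_seq, pos, escores, k),
            escore_profile(alt_seq, pos, escores, k))


if __name__ == "__main__":
    ref = "ACGTACGTACGTACG"            # toy sequence, SNP at index 7
    toy_scores = {"ACGTACGT": 0.45}    # made-up E-score for illustration only
    ref_profile, alt_profile = ref_alt_profiles(ref, 7, "A", toy_scores)
    print(ref_profile)
    print(alt_profile)
```

A classifier such as the CNN described above would then take the two profiles as input; how SNPeBoT actually encodes and stacks them is not specified in the abstract.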

Conclusion: In this paper we present our model, SNPeBoT (Single Nucleotide Polymorphism effect on Binding of Transcription Factors), in its standalone and web server forms. The increased recovery and prediction accuracy for allele-specific binding events could prove useful in discovering non-coding mutations relevant to disease.
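The two reported metrics, accuracy and ASB recall, can be sketched for a three-class labelling (gain, loss, no-change) as follows. The abstract does not define ASB recall precisely, so this sketch assumes that an ASB event counts as recovered only when its exact class (gain or loss) is predicted. The function names are hypothetical.

```python
from typing import Sequence

ASB_CLASSES = {"gain", "loss"}  # classes treated as allele-specific binding events


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def asb_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of true gain/loss samples predicted with the correct class.

    Assumption: exact-class matching; a gain predicted as loss is a miss.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must be of equal length")
    asb_pairs = [(t, p) for t, p in zip(y_true, y_pred) if t in ASB_CLASSES]
    if not asb_pairs:
        raise ValueError("no ASB samples in y_true")
    return sum(t == p for t, p in asb_pairs) / len(asb_pairs)


if __name__ == "__main__":
    truth = ["gain", "loss", "no-change", "no-change", "gain"]
    preds = ["gain", "no-change", "no-change", "gain", "loss"]
    print(accuracy(truth, preds))    # 2 of 5 correct
    print(asb_recall(truth, preds))  # 1 of 3 ASB events recovered
```

Reporting both numbers matters when no-change samples dominate, because accuracy alone can be high even if few ASB events are recovered.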

Keywords: Gene regulation; Neural network; Transcription factor.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Conflict of interest: The authors declare no competing interests.

Figures

Fig. 1
Screenshot of the SNPeBoT website
Fig. 2
Preprocessing stages of the data. Each stage is labelled with the number of data points it contains and is color coded for the presence of classes (gain, loss and no-change). The class distributions for the train, validation and test data sets are (gain: 2959, loss: 2959, no-change: 2959), (gain: 129, loss: 206, no-change: 599) and (gain: 1140, loss: 1822, no-change: 5438), respectively
Fig. 3
Diagram of SNPeBoT's CNN model architecture. Each layer in the model is labeled. The data shape at the output of each layer is shown either within the layer figure or immediately after it
Fig. 4
ASB recall vs accuracy for benchmarking tests of SNPeBoT (blue), the FIMO/PWM approach (purple), motifbreakR (green) and atSNP (red)
Fig. 5
Barplot showing the accuracy and ASB recall rate for SNPeBoT when the test data are grouped by TF family, with 8 of the families shown here
Fig. 6
Binary receiver operating characteristic curves for the best-scoring thresholds of all three tested tools. A) SNPeBoT B) atSNP C) motifbreakR D) The averaged curves of A-C displayed together
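As a quick consistency check on the class counts in the Fig. 2 caption, the short Python sketch below sums the three splits and compares the result with the 18211 E-score profiles stated in the abstract. The counts are copied from the caption; the variable names are illustrative only.

```python
# Class counts per split, copied from the Fig. 2 caption.
splits = {
    "train": {"gain": 2959, "loss": 2959, "no-change": 2959},
    "validation": {"gain": 129, "loss": 206, "no-change": 599},
    "test": {"gain": 1140, "loss": 1822, "no-change": 5438},
}

# Total number of profiles in each split and overall.
split_totals = {name: sum(counts.values()) for name, counts in splits.items()}
grand_total = sum(split_totals.values())

# Share of the no-change class in the test split.
test_no_change_share = splits["test"]["no-change"] / split_totals["test"]

if __name__ == "__main__":
    print(split_totals)   # {'train': 8877, 'validation': 934, 'test': 8400}
    print(grand_total)    # 18211
    print(round(test_no_change_share, 3))
```

The totals match the figure in the abstract. The training split is balanced across the three classes, while roughly 65% of the test split is no-change, so a classifier that always predicted no-change would reach about that accuracy on the test data while recovering no ASB events.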

