Review
Brief Bioinform. 2023 May 19;24(3):bbad156.
doi: 10.1093/bib/bbad156.

A survey on algorithms to characterize transcription factor binding sites

Manuel Tognon et al. Brief Bioinform.

Abstract

Transcription factors (TFs) are key regulatory proteins that control the transcriptional rate of cells by binding short DNA sequences called transcription factor binding sites (TFBS) or motifs. Identifying and characterizing TFBS is fundamental to understanding the regulatory mechanisms governing the transcriptional state of cells. During the last decades, several experimental methods have been developed to recover DNA sequences containing TFBS. In parallel, computational methods have been proposed to discover and identify TFBS motifs based on these DNA sequences. This is one of the most widely investigated problems in bioinformatics and is referred to as the motif discovery problem. In this manuscript, we review classical and novel experimental and computational methods developed to discover and characterize TFBS motifs in DNA sequences, highlighting their advantages and drawbacks. We also discuss open challenges and future perspectives that could fill the remaining gaps in the field.

Keywords: motif discovery algorithms; motif models; transcription factors; transcription factors motif discovery.

Figures

Figure 1
Experimental and computational methods to discover TFBS and popular models to represent binding site motifs. Protein binding microarray (PBM), HT-SELEX and ChIP-seq have become the most popular assays to determine TF binding preferences and identify their target sites (TFBS) in recent years. Computational motif discovery methods can be grouped into five classes, based on the algorithms employed to discover TFBS: enumerative, alignment-based, probabilistic graphical model-based, SVM-based and DNN-based methods. TFBS sequences prioritized by motif discovery algorithms are encoded in computational models representing the binding preferences of the investigated TFs.
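
As an illustration of what "encoding TFBS sequences in a computational model" can mean in practice, the sketch below builds a position weight matrix (PWM), one common motif model, from a handful of aligned binding sites and uses it to scan a sequence. It is a minimal sketch and not taken from the review: the function names (`build_pwm`, `score`, `best_hit`), the pseudocount of 1, the uniform background of 0.25 and the example sites are all assumptions chosen for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumes uppercase DNA with no ambiguity codes


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites.

    Returns a list with one {base: log2-odds} dict per motif position.
    The pseudocount and uniform background are illustrative assumptions.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(width):
        counts = Counter(site[i] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds of a window as long as the motif."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites, made up for this example.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
hit_score, hit_start = best_hit(pwm, "CCCTGACTCAGGG")
print(hit_start, round(hit_score, 2))
```

A PWM treats motif positions as independent, which keeps the model simple; more expressive models relax that assumption at the cost of more parameters.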
