Proc Natl Acad Sci U S A. 2014 Dec 2;111(48):17140-5.
doi: 10.1073/pnas.1410569111. Epub 2014 Oct 13.

Protein-DNA binding in the absence of specific base-pair recognition

Ariel Afek et al. Proc Natl Acad Sci U S A.

Abstract

Until now, it has been reasonably assumed that specific base-pair recognition is the only mechanism controlling the specificity of transcription factor (TF)-DNA binding. Contrary to this assumption, here we show that nonspecific DNA sequences possessing certain repeat symmetries, when present outside of specific TF binding sites (TFBSs), statistically control TF-DNA binding preferences. We used high-throughput protein-DNA binding assays to measure the binding levels and free energies of binding for several human TFs to tens of thousands of short DNA sequences with varying repeat symmetries. Based on statistical mechanics modeling, we identify a new protein-DNA binding mechanism induced by DNA sequence symmetry in the absence of specific base-pair recognition, and experimentally demonstrate that this mechanism indeed governs protein-DNA binding preferences.

Keywords: nonspecific protein−DNA binding; protein−DNA binding; transcriptional regulation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Examples of specific protein−DNA binding, involving proteins used in this study. Crystal structures of specific protein−DNA complexes formed by proteins from the two structural families explored in this work: bHLH family (Max) and E2F/DP family (E2F4:Dp2).
Fig. 2.
Direct experimental test of nonconsensus protein−DNA binding in the absence of specific base-pair recognition, using computationally designed DNA sequences with identical specific motifs, identical nucleotide content, and flanking regions with different nonconsensus sequence elements. (A) Examples of computationally designed 36-mer DNA sequences possessing different sequence repeat symmetries. The sequences shown in the example were generated at 1/ξ = 0.3 (in dimensionless units). (B) Measured binding preferences of the Max protein toward designed DNA sequences characterized by different symmetries and length scales of DNA sequence correlations. The legend shows the symmetry types, where α represents A, T, C, or G. For each symmetry type, each point on the corresponding curve represents the measured average PBM fluorescent intensity (representing the average concentration of the Max protein bound to DNA) over a few hundred DNA sequences designed at a given value of ξ. For example, to obtain one point at 1/ξ = 0.3, for the [αNα] symmetry, we used the measured PBM intensity from 404 different DNA sequences designed at this value of ξ. To compute error bars, for each value of ξ, we divided measured fluorescent intensities into four groups, computed the average intensity for each group, and computed the SD of the average intensities. The color code of the sequences corresponds to the color code used in A. The fluorescent intensity is given in dimensionless units; higher values correspond to stronger binding events. (C) Computed nonconsensus free energy per bp, f = F_TF/M (in units of k_BT), shows a strong, negative linear correlation with the measured Max−DNA binding preferences. (D) Probability distribution of the measured Max−DNA binding intensities from the entire designed DNA library shows a large variability in the strength of Max binding to DNA sequences with identical specific motifs.
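The error-bar procedure described in the Fig. 2B caption (split the intensities into four groups, average each group, take the SD of the group averages) might be sketched as follows. This is a minimal illustration, not the authors' code: the function name `grouped_error_bar`, the random assignment of sequences to groups, and the use of the sample SD are all assumptions, since the caption does not state how groups were formed or which SD estimator was used.

```python
import random
import statistics


def grouped_error_bar(intensities, n_groups=4, seed=0):
    """Return (overall mean, SD of the group means) for one value of xi.

    Assumptions (not stated in the caption): sequences are assigned to
    groups at random, and the sample standard deviation is used.
    """
    values = list(intensities)
    if len(values) < n_groups:
        raise ValueError("need at least one intensity per group")

    # Shuffle with a fixed seed so the grouping is reproducible.
    rng = random.Random(seed)
    rng.shuffle(values)

    # Deal the shuffled values into n_groups nearly equal groups.
    groups = [values[i::n_groups] for i in range(n_groups)]
    group_means = [statistics.fmean(g) for g in groups]

    # The plotted point is the overall mean; the error bar is the SD
    # of the per-group means.
    return statistics.fmean(values), statistics.stdev(group_means)
```

With 404 intensities at a given ξ, as in the caption's example, each group would hold 101 values.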
Fig. 3.
Measured free energy of nonconsensus protein−DNA binding specifies the average, statistical strength of the nonconsensus effect. The figure shows the measured nonconsensus protein−DNA binding free-energy difference, ΔΔG = ΔG(ξ) − ΔG_rand, where ΔG_rand is the free energy of protein binding to random DNA sequences, and ΔG(ξ) represents the free energy for sequences designed with different correlation scales, ξ, and different DNA sequence symmetries. All DNA sequences have identical GC content, and an identical specific binding motif in the center of each sequence.
Fig. 4.
Genomic nonconsensus DNA sequence elements surrounding specific TF−DNA binding motifs significantly influence TF−DNA binding preferences. These examples show the results for human transcription factors belonging to the bHLH family: (A) Mad, (B) Max, and (C) Myc; and the E2F/DP family: (D) E2F1, (E) E2F4, and (F) Dp1. Each plot shows the correlation between the computed free energy of nonconsensus protein−DNA binding per bp, f (in units of k_BT), and the bound protein occupancy measured experimentally (by gcPBM). The x axis represents the logarithm of the measured gcPBM signal intensity (in dimensionless units; higher values correspond to stronger binding events). The data are binned into 25 bins. In A, B, and C, we separated the 36-bp-long genomic DNA sequences used in the experiment into two groups: sequences with the exact specific motif, CACGTG (red), and sequences with the mutated motif (black). Overall, TF−DNA binding was probed for 18,123 DNA sequences in A, 16,421 sequences in B, 15,936 sequences in C, and 5,329 sequences in D, E, and F. Below each plot are examples of sequences used in these experiments; specific motifs are marked in red.
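The binning and correlation described in the Fig. 4 caption could be reproduced along the following lines. This is a sketch under stated assumptions rather than the authors' analysis: the function name `bin_and_correlate` is hypothetical, equal-count bins along the log-intensity axis are assumed because the caption does not specify the binning scheme, and the natural logarithm is used (the Pearson coefficient does not depend on the base).

```python
import math
import statistics


def bin_and_correlate(intensities, free_energies, n_bins=25):
    """Bin (log intensity, f) pairs and correlate the bin averages.

    Assumption: bins hold (nearly) equal numbers of sequences, ordered
    by log intensity. Returns (list of bin averages, Pearson r).
    """
    pairs = sorted((math.log(i), f) for i, f in zip(intensities, free_energies))
    n = len(pairs)
    if n < n_bins:
        raise ValueError("need at least one sequence per bin")

    binned = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        binned.append((statistics.fmean(x for x, _ in chunk),
                       statistics.fmean(f for _, f in chunk)))

    # Pearson correlation coefficient of the binned averages.
    xs = [x for x, _ in binned]
    fs = [f for _, f in binned]
    mx, mf = statistics.fmean(xs), statistics.fmean(fs)
    cov = sum((x - mx) * (f - mf) for x, f in zip(xs, fs))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_f = sum((f - mf) ** 2 for f in fs)
    return binned, cov / math.sqrt(var_x * var_f)
```

A negative r from such a calculation would correspond to lower (more favorable) nonconsensus free energy at higher measured binding intensity.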
