Nucleic Acids Res. 2009 Jul;37(12):4076-88.
doi: 10.1093/nar/gkp289. Epub 2009 May 8.

Experimentally based contact energies decode interactions responsible for protein-DNA affinity and the role of molecular waters at the binding interface

N Alpay Temiz et al. Nucleic Acids Res. 2009 Jul.

Abstract

A major obstacle towards understanding the molecular basis of transcriptional regulation is the lack of a recognition code for protein-DNA interactions. Using high-quality crystal structures and binding data on the promiscuous family of C2H2 zinc fingers (ZF), we decode 10 fundamental specific interactions responsible for protein-DNA recognition. The interactions include five hydrogen bond types, three atomic desolvation penalties, a favorable non-polar energy, and a novel water accessibility factor. We apply this code to three large datasets containing a total of 89 C2H2 transcription factor (TF) mutants on the three ZFs of EGR. Guided by molecular dynamics simulations of individual ZFs, we map the interactions into homology models that embody all feasible intra- and intermolecular bonds, selecting for each sequence the structure with the lowest free energy. These interactions reproduce the change in affinity of 35 mutants of finger I (R² = 0.998), 23 mutants of finger II (R² = 0.96) and 31 finger III human domains (R² = 0.94). Our findings reveal recognition rules that depend on DNA sequence/structure, molecular water at the interface and induced fit of the C2H2 TFs. Collectively, our method provides the first robust framework to decode the molecular basis of TFs binding to DNA.
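
The scoring idea summarized in the abstract (count the interactions in each candidate model, sum their energies, keep the lowest-energy model, and compare predicted against measured affinity changes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the interaction names, the energy values, the purely additive form, and the use of the squared Pearson correlation for R² are hypothetical stand-ins, not the contents of the paper's Table 2 or its Equations (2) and (4), and the water accessibility factor is omitted.

```python
# Hypothetical per-interaction energies (kcal/mol). These are placeholders
# for illustration only, NOT the values reported in the paper's Table 2.
ENERGIES = {
    "hbond_base": -1.0,       # side chain to base hydrogen bond
    "hbond_phosphate": -0.5,  # hydrogen bond to a backbone phosphate
    "desolv_O": 0.8,          # unmatched side chain oxygen penalty
    "desolv_NH2": 0.6,        # unmatched NH2 group penalty
    "nonpolar": -0.3,         # favorable non-polar contact
}


def model_energy(counts):
    """Additive energy of one candidate model given its interaction counts."""
    return sum(ENERGIES[name] * n for name, n in counts.items())


def lowest_energy(models):
    """Energy of the most favorable model among the candidates for a sequence."""
    return min(model_energy(m) for m in models)


def ddg(mutant_models, wt_models):
    """Predicted change in free energy of a mutant relative to wild type."""
    return lowest_energy(mutant_models) - lowest_energy(wt_models)


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical candidate models: one for wild type, two for a mutant.
wt = [{"hbond_base": 4, "hbond_phosphate": 2}]
mutant = [
    {"hbond_base": 3, "hbond_phosphate": 2, "desolv_O": 1},
    {"hbond_base": 2, "hbond_phosphate": 2, "nonpolar": 1},
]
predicted_ddg = ddg(mutant, wt)
```

Taking the minimum over candidate models mirrors the abstract's statement that the structure with the lowest free energy is selected for each sequence; how the candidates are generated (homology modeling guided by molecular dynamics) is outside the scope of this sketch.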

Figures

Figure 1.
Structure of EGR complexed with its consensus site. (A) EGR is colored yellow and DNA is colored pink. Two exposed side chains at the binding sites of fingers I and III are shown as blue spheres. Buried key arginines are shown as cyan spheres. (B) Binding mode of FI of EGR. Hydrogen bonds are shown as pink dashed lines. (C) Diagram of interaction network of FI. Arrows indicate H-bonds. Colors correspond to a classification scheme detailed in Table 2. Black arrows indicate intramolecular H-bonds; those drawn above/below the protein sequence correspond to sc–bb/sc–sc bonds.
Figure 2.
Crystal structures of binding modes and induced fit on ZFs. (A) WT (RDER) EGR with GCG site (blue). (B) QGSR mutant with GCA site (yellow). (C) DSNR mutant with GAC site (pink). (D) WT with mutant GCA site (green). (E) RADR mutant with GCG site (purple). Hydrogen bonds between the side chains and the bases are shown as dashed lines. (F) Superimposition of the α-helices of the four modes after aligning DNA–bb's. Note that the α-helices of the Q and D modes are closer to DNA than that of the WT mode.
Figure 3.
Sketches illustrating atom desolvation penalties and solvation effects at the protein (top)–DNA (bottom) binding interface. H-bonds are indicated as dashed lines and filled spheres correspond to water molecules. (A) From left to right, side chain oxygen (δO) and NH2 (δNH2) desolvation penalties arise when side chain atoms do not form an H-bond with protein or DNA. The intramolecular H-bond desolvation penalty (δHB) is assessed when oxygen groups are left unmatched. (B) Effect of solvation on the strength of intermolecular H-bonds. Default binding interface with ϵ as the effective H-bond strength (center). The cartoon also reflects the fact that bonds require a surface to lie on. Solvated binding interface (left): competing water molecules weaken the intermolecular H-bond by a factor of λw. Desolvated binding interface (right): H-bond strength increases by a factor of 1/(1 − λw).
Figure 4.
Predicted complex structures for six EGR FI and six DNA-binding site sequences. Arrows indicate H-bonds, and dashed arrows denote H-bonds to backbone phosphates. Intramolecular H-bonds are indicated by black arrows/lines. Blue spheres show the desolvation penalties for side chain oxygens (δO). Orange spheres show the desolvation penalty for intramolecular H-bonds (δHB). Rectangles are the desolvation penalties for NH2 groups (δNH2). Filled/open triangles point to the interaction that is being solvated/desolvated at the binding interface. The numbers on the left of each model indicate the experimental (black) and predicted (red) change in affinity with respect to the RDER/GCG WT structure shown in the upper-left corner. Predictions can be reproduced by decoding interactions using Table 2 and Equation (4). All complexes are built on top of the WT FI crystal, unless shown inside a rectangle. Red/green/magenta rectangles denote those complexes whose homology models were superimposed to Q/BB1/BB2 binding modes, respectively.
Figure 5.
Rearrangement of waters at the protein–DNA interface due to cytosine to adenine mutation. FI of EGR is shown in dark blue. Green ball and sticks show crystal GCG triplet. Mutated A0 is shown as pink ball and sticks. Cyan spheres are the waters at the interface found in the crystal of WT (GCG) complex. Pink spheres are modeled extra waters at the interface of EGR FI–GAG complex. Note the shift in the base due to C→A mutation allowing waters to fit in.
Figure 6.
Predicted ΔΔGCalc versus experimental ΔΔGExp changes in free energy due to protein and/or DNA mutations. ΔΔGs are computed using Equation (2). The solid line corresponds to the y = x line. Since the interaction code is predicted based on experiments, the same error bars apply to both.
Figure 7.
Predicted complex structures for FII mutants. Symbols are the same as in Figure 4 and Table 2. Homology models built on D binding mode are indicated by a yellow rectangle. WT is indicated in upper-left corner.
Figure 8.
Predicted ΔΔGCalc versus experimental ΔΔGExp changes in free energy due to protein and/or DNA mutations for (A) FII (15) and (B) FIII (19). As expected, minimum energy models typically resulted in an upper bound of ΔΔGExp, suggesting the possibility of yet more subtle models or solvent conditions for some sequences.
Figure 9.
Predicted complex structures for FIII experiments. Symbols are the same as in Figure 4 and Table 2. Plus signs show desolvation of hydrophobic groups (δNP). Purple spheres show the desolvation penalty for N+5 (δN+5). WT is in upper-left corner.

References

    1. Bulyk M. Computational prediction of transcription-factor binding site locations. Genome Biol. 2003;5:201.
    2. GuhaThakurta D. Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res. 2006;34:3585–3598.
    3. Siggia ED. Computational methods for transcriptional regulation. Curr. Opin. Genet. Dev. 2005;15:214–221.
    4. Ladomery M, Dellaire G. Multifunctional zinc finger proteins in development and disease. Ann. Hum. Genet. 2002;66:331–342.
    5. Benos PV, Lapedes AS, Stormo GD. Probabilistic code for DNA recognition by proteins of the EGR family. J. Mol. Biol. 2002;323:701–727.