Biochim Biophys Acta Gen Subj. 2017 Nov;1861(11 Pt A):2789-2801.

doi: 10.1016/j.bbagen.2017.07.024. Epub 2017 Aug 1.

Using competition assays to quantitatively model cooperative binding by transcription factors and other ligands

Jacob Peacock¹, James B Jaynes²

Affiliations

¹ Dept. of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA 19107, United States.
² Dept. of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA 19107, United States. Electronic address: james.jaynes@jefferson.edu.

PMID: 28774855
PMCID: PMC5623634
DOI: 10.1016/j.bbagen.2017.07.024

Jacob Peacock et al. Biochim Biophys Acta Gen Subj. 2017 Nov.

Abstract

Background: The affinities of DNA binding proteins for target sites can be used to model the regulation of gene expression. These proteins can bind to DNA cooperatively, strongly impacting their affinity and specificity. However, current methods for measuring cooperativity do not provide the means to accurately predict binding behavior over a wide range of concentrations.

Methods: We use standard computational and mathematical methods, and develop novel methods as described in Results.

Results: We explore some complexities of cooperative binding, and develop an improved method for relating in vitro measurements to in vivo function, based on ternary complex formation. We derive expressions for the equilibria among the various complexes, and explore the limitations of binding experiments that model the system using a single parameter. We describe how to use single-ligand binding and ternary complex formation in tandem to determine parameters that have thermodynamic relevance. We develop an improved method for finding both single-ligand dissociation constants and concentrations simultaneously. We show how the cooperativity factor can be found when only one of the single-ligand dissociation constants can be measured.

Conclusions: The methods that we develop constitute an optimized approach to accurately model cooperative binding.

General significance: The expressions and methods we develop for modeling and analyzing DNA binding and cooperativity are applicable to most cases where multiple ligands bind to distinct sites on a common substrate. The parameters determined using these methods can be fed into models of higher-order cooperativity to increase their predictive power.

Keywords: Competition EMSA; Cooperative DNA binding; Curve fitting; Engrailed; Extradenticle-Homothorax; Finding dissociation constants; Hill plots; Modeling ligand-substrate interactions; Quantifying cooperativity.


Figures

**Fig. 1. Barriers to determining cooperativity from Hill plots**
A: Model for cooperative binding to AB (substrate with two distinct binding sites) by ligand a. The ternary complex *AaBa* can dissociate in two ways, losing a from either the A or the B site first. Defining the Kd’s for the ternary complex as *K_A*/n and *K_B*/n reduces the number of variables, because, from the definitions of the Kd’s (below the line), *K_A* divided by *K_A*/n gives the same thing as *K_B* divided by *K_B*/n. B: Hill plots for two binding sites with the same or different Kd’s. From the model in A, the ratio (fractional occupancy)/(1 − fractional occupancy), which is $\frac{(K_A + K_B + 2n[a])\,[a]}{2K_AK_B + (K_A + K_B)[a]}$ (derived in Fig. 1A of Peacock and Jaynes [2]), was used to generate Hill plots. Concentration units (for Kd’s and [a]) are arbitrary. The case where *K_A* = *K_B* = 5 and n = 25.5 is shown as a solid blue curve, along with a tangent line (purple) at the point of maximum slope. Also shown are: the same equivalent sites, but with negative cooperativity (n = 0.04, dashed blue), and the case of two non-equivalent sites (*K_A* = 5, *K_B* = 500), either with positive cooperativity (n = 25.5, solid red) or with no cooperativity (n = 1, dashed red). Note the similarity in shape of the plots for equivalent sites with negative cooperativity and for non-equivalent sites without cooperativity. Also note that for two non-equivalent sites, when n approaches the value $(K_A + K_B)^2 / (4K_AK_B)$ (derived in Fig. 1A of Peacock and Jaynes [2]), the plot approaches a straight line of slope 1, which is indistinguishable from equivalent sites with no cooperativity. Thus, without prior knowledge that sites are equivalent, Hill plots are at best ambiguous for identifying cooperativity.
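The ambiguity described in Fig. 1B can be checked numerically. The following Python sketch evaluates the Hill ratio quoted in the caption and estimates the slope of the Hill plot by a central difference in log-log space. The function names (`hill_ratio`, `hill_slope`, `linearizing_n`) and the finite-difference step are illustrative assumptions, not part of the paper.

```python
import math

def hill_ratio(a, K_A, K_B, n):
    """(fractional occupancy)/(1 - fractional occupancy) for ligand a
    binding two sites, using the expression quoted in Fig. 1B."""
    return (K_A + K_B + 2.0 * n * a) * a / (2.0 * K_A * K_B + (K_A + K_B) * a)

def hill_slope(a, K_A, K_B, n, rel_step=1e-4):
    """Slope of the Hill plot, d log(ratio) / d log([a]), estimated by a
    central difference (the step size is an arbitrary choice)."""
    hi, lo = a * (1.0 + rel_step), a * (1.0 - rel_step)
    d_log_ratio = math.log(hill_ratio(hi, K_A, K_B, n)) - math.log(hill_ratio(lo, K_A, K_B, n))
    return d_log_ratio / (math.log(hi) - math.log(lo))

def linearizing_n(K_A, K_B):
    """Value of n at which the Hill plot becomes a straight line of slope 1."""
    return (K_A + K_B) ** 2 / (4.0 * K_A * K_B)

if __name__ == "__main__":
    # Non-equivalent sites from the caption: K_A = 5, K_B = 500.
    n_flat = linearizing_n(5.0, 500.0)
    print("n giving slope 1:", n_flat)
    print("slope at that n :", hill_slope(50.0, 5.0, 500.0, n_flat))
    # Equivalent sites with positive and negative cooperativity.
    print("K=5, n=25.5     :", hill_slope(1.0, 5.0, 5.0, 25.5))
    print("K=5, n=0.04     :", hill_slope(5.0, 5.0, 5.0, 0.04))
```

For *K_A* = 5 and *K_B* = 500 the linearizing value works out to 25.5025, which is why the solid red curve (n = 25.5) is nearly indistinguishable from a non-cooperative plot for equivalent sites.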

**Fig. 2. Two cooperating proteins binding to two different sites**
A: Schematic of binding equilibria. Either protein can bind first. The upper path from left to right represents initial binding by protein b. The equilibrium concentration of the *ABb* single-protein complex is governed by its Kd, *K_B*. It can then bind protein a to form the ternary complex *AaBb*. The alternative pathway to the ternary complex is similarly diagrammed on the left. As the definitions of the various dissociation constants below show, not all 4 are independent. If we divide *K_B* by the Kd that governs the dissociation of b from *AaBb*, we get the same quantity that we get if we divide *K_A* by the Kd that governs the dissociation of a from *AaBb*. It is therefore convenient to define this ratio as the cooperativity factor n. B: Graphs of [*AaBb*] as a function of increasing [a]_T, holding [b]_T constant. For all graphs, [AB]_T = 1, [b]_T = 2, and *K_B* = 500, while *K_A*’s and cooperativity factors vary. The apparent Kd (based on a single-site model, see Fig. 2B in Peacock and Jaynes [2]) is the [a] at the point of half-maximal [*AaBb*], which is marked by a dot for each curve. Note that the relative amounts of ternary complex depend strongly on [a]_T. The green curve, which has the lowest apparent Kd (0.68), actually shows the lowest [*AaBb*] at high [a]_T. This is due to its relatively low n, which determines the [*AaBb*] at saturation with protein a, independent of *K_A*. The black curve crosses the red curve, and also shows less binding at high [a]_T due to a lower n. The blue curve does not cross the red curve, and has a lower [*AaBb*] at all values of [a]_T, despite having a lower apparent Kd! So, a ranking of apparent Kd’s from this type of experiment is not predictive of relative ternary complex formation overall. Derivations of expressions relating [*AaBb*] to [a]_T (and to [a], [*AaB*], and [AB]) are given in Fig. 2A of Peacock and Jaynes [2]. Derivations of expressions for [*AaBb*]_max and for finding the apparent Kd are given in Fig. 2B of Peacock and Jaynes [2]. C: Relative ternary complex formation can be qualitatively different depending on the fixed [b]_T chosen for the experiment. The two pairs of curves (upper and lower) represent [*AaBb*] formed on two sites (red and dark blue) as a function of increasing [a]_T, differing only in the fixed [b]_T. Note that at the lower [b]_T, the site represented by the dark blue curve (*K_A* = 5975, *K_B* = 400, n = 113) forms more ternary complex throughout most of the experimental range, while at the higher [b]_T, the site represented by the red curve (*K_A* = 90,000, *K_B* = 2808, n = 7000) forms more over the entire range. This illustrates another limitation of modeling cooperative binding using a single parameter. **NOTE:** Concentration units are not specified, because in all cases, these units (which apply to the concentrations of ligands and substrate, as well as to the Kd’s) can be factored out of the governing equations, and do not affect the shapes of curves, or any of the conclusions.
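The equilibria in Fig. 2A can be solved numerically for given total concentrations. The sketch below is one possible approach, not the closed-form expressions of the companion paper: it assumes only the mass-action definitions in the caption (Kd’s *K_A* and *K_B* for single binding, *K_A*/n and *K_B*/n within the ternary complex). For a trial value of free [b], binding of a reduces to a single-site problem with an effective Kd, and free [b] is then found by bisection. All function names are hypothetical.

```python
import math

def single_site_complex(sub_T, lig_T, Kd):
    """Complex concentration for simple 1:1 binding (the smaller root of
    the usual quadratic, written in a numerically stable form)."""
    s = sub_T + lig_T + Kd
    return 2.0 * sub_T * lig_T / (s + math.sqrt(s * s - 4.0 * sub_T * lig_T))

def equilibrium(AB_T, a_T, b_T, K_A, K_B, n, iters=200):
    """Equilibrium concentrations for proteins a and b binding sites A and B
    on one substrate with cooperativity factor n. Bisects on free [b]."""
    def forms(b):
        P = 1.0 + b / K_B          # weight of AB plus ABb, relative to free AB
        Q = 1.0 + n * b / K_B      # weight of AaB plus AaBb, relative to AaB alone
        # With free b fixed, a sees one site with effective Kd = K_A * P / Q.
        C = single_site_complex(AB_T, a_T, K_A * P / Q)   # [AaB] + [AaBb]
        a = a_T - C
        AB = (AB_T - C) / P
        return {"AB": AB, "a": a, "b": b,
                "AaB": AB * a / K_A,
                "ABb": AB * b / K_B,
                "AaBb": n * AB * a * b / (K_A * K_B)}
    lo, hi = 0.0, b_T
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f = forms(mid)
        if f["b"] + f["ABb"] + f["AaBb"] > b_T:   # too much total b: lower free b
            hi = mid
        else:
            lo = mid
    return forms(0.5 * (lo + hi))

def ternary_max(AB_T, b_T, K_B, n):
    """[AaBb] at saturation with protein a: b then binds AaB with Kd K_B/n,
    so the plateau does not depend on K_A."""
    return single_site_complex(AB_T, b_T, K_B / n)

if __name__ == "__main__":
    # [AB]_T = 1, [b]_T = 2, K_B = 500 as in Fig. 2B; K_A and n are illustrative.
    for K_A in (5.0, 50.0):
        print(K_A, equilibrium(1.0, 1e7, 2.0, K_A, 500.0, 100.0)["AaBb"])
    print("plateau:", ternary_max(1.0, 2.0, 500.0, 100.0))
```

Evaluating `equilibrium` over a range of `a_T` for several (*K_A*, n) pairs gives curves of the kind described in panel B, where the plateau is set by n and *K_B* alone.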

**Fig. 3. Competition curves measuring ternary complex have limited predictive power**
Concentrations and Kd’s are in nM. A: Ternary complex as a function of competitor. For each curve, the [*AaBb*], labeled ternary complex, is graphed as a function of an unlabeled competitor. The lower (black) curve is self-competition by the high-affinity site (B1a [38]), where the unlabeled competitor, *U_AB*, is the same DNA sequence as the labeled binding site, AB. The other two curves show competition with two binding site oligos that have very different Kd’s and cooperativity factors (n), yet compete similarly for ternary complex formation by labeled B1a oligo. Note that self-competition is much more effective at all concentrations shown than is competition by either of the other oligos, while the other two oligos compete very similarly over a wide concentration range. B: Forms of competitor oligo in the competition experiment. Each graph shows the 3 bound forms for one of the competitor oligos. In each case, the solid line shows [*AaBb*], the dashed line shows [*AaB*], and the dotted line shows [*ABb*]. The upper panel shows the less cooperative low-affinity site (B1b, blue), the middle panel shows the more cooperative low-affinity site (A2a, red), and the lower panel shows the high-affinity site (B1a, black). The left and right sections of each curve show two different ranges of [a]_T, on two different scales. Note that at the higher concentrations of competitor, B1b forms mostly single-protein complexes, while A2a forms mostly ternary complex, reflecting its much higher cooperativity. At concentrations well beyond the range shown, all of each protein is incorporated into single-protein complexes, as the proteins are distributed over a vast excess of oligo. For oligo B1b (blue), we see the approach to this limit, while for oligo A2a (red), this approach is beyond the range shown. C: Concentrations of complexes as a function of [a]_T, holding [b]_T constant (no competitor). Using the same oligos as in A (and B), the concentrations of the various protein-containing forms are graphed for an experiment similar to that in Fig. 2. The left and right sections of each curve show two different ranges of [a]_T, on two different scales. The top graph shows [*AaBb*] for each oligo, color-coded as in A and B. Note that despite the similarity of the blue and red competition curves in A, the oligo with the higher value of n (red) forms more ternary complex at all [a]_T, and the red curve approaches the black curve at high [a]_T. This provides a plausible explanation for the *in vivo* behavior of the binding sites represented by the blue and red curves: the one with the higher n (red) is more potent. It is more similar to the black curve than to the blue curve at high [a]_T, suggesting that the ability to form ternary complexes at high [a]_T may explain the relative functionality of these binding sites *in vivo*. The middle and bottom graphs show [*AaB*] (dashed) and [*ABb*] (dotted), respectively, for each oligo, color-coded as above. As seen in the competition experiment in B, the less cooperative oligo forms more binary complexes (blue) than does the more cooperative oligo (red), especially *AaB* at high [a]_T, due to its having a lower Kd for binding each of the proteins. A similar phenomenon occurs at high concentrations of these oligos in the competition experiment: the less cooperative site sequesters more of each protein individually, while it forms less ternary complex than does the more cooperative site. These complexes are invisible in a competition assay, because the competitor oligo is unlabeled. For derivations of equations that can be used to generate these graphs, see Fig. 3A in Peacock and Jaynes [2]. For derivations of equations for graphing the total occupancy by each protein as a function of [a]_T, see Fig. 3B in Peacock and Jaynes [2].
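A competition experiment of the kind in Fig. 3A can be simulated by letting a labeled and an unlabeled oligo share the same pools of proteins a and b. The following Python sketch is an assumption-laden illustration rather than the authors' method: it solves the two mass-balance equations by nested bisection on the free protein concentrations. The Kd’s, n, and concentrations in the example are hypothetical, since the caption does not list the values for B1a, B1b, or A2a.

```python
def bound_forms(AB_T, a, b, K_A, K_B, n):
    """([AaB], [ABb], [AaBb]) on one oligo at given FREE [a] and [b]."""
    wa, wb = a / K_A, b / K_B
    free_oligo = AB_T / (1.0 + wa + wb + n * wa * wb)
    return free_oligo * wa, free_oligo * wb, free_oligo * n * wa * wb

def bisect(f, lo, hi, iters=100):
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def solve_free(oligos, a_T, b_T):
    """Free [a] and [b] when several oligos compete for both proteins.
    `oligos` is a list of (total concentration, K_A, K_B, n) tuples."""
    def total_a(a, b):
        return a + sum(f[0] + f[2] for f in
                       (bound_forms(T, a, b, KA, KB, n) for T, KA, KB, n in oligos))
    def total_b(a, b):
        return b + sum(f[1] + f[2] for f in
                       (bound_forms(T, a, b, KA, KB, n) for T, KA, KB, n in oligos))
    def a_given_b(b):
        # Inner solve: free a that satisfies the a mass balance at this free b.
        return bisect(lambda a: total_a(a, b) - a_T, 0.0, a_T)
    # Outer solve: free b that satisfies the b mass balance.
    b = bisect(lambda b: total_b(a_given_b(b), b) - b_T, 0.0, b_T)
    return a_given_b(b), b

def labeled_ternary(labeled, competitor_params, U_T, a_T, b_T):
    """[AaBb] on the labeled oligo in the presence of U_T unlabeled competitor.
    `competitor_params` is (K_A, K_B, n) for the competitor."""
    a, b = solve_free([labeled, (U_T,) + tuple(competitor_params)], a_T, b_T)
    return bound_forms(labeled[0], a, b, *labeled[1:])[2]

if __name__ == "__main__":
    labeled = (0.1, 100.0, 50.0, 200.0)   # hypothetical: [AB]_T, K_A, K_B, n (nM)
    for U_T in (0.0, 10.0, 100.0, 1000.0):
        # Self-competition: the competitor has the same parameters as the probe.
        print(U_T, labeled_ternary(labeled, labeled[1:], U_T, 20.0, 20.0))
```

Replacing `labeled[1:]` with a different (*K_A*, *K_B*, n) triple gives competition by a non-identical site, and calling `bound_forms` with the competitor's total concentration returns the unlabeled complexes that panel B describes as invisible in the assay.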

**Fig. 4. Illustrations of curve-fitting equations and error comparison**
A: Families of competition curves with different values of [a]_T and *K_A*. Labeled binary complex, [Aa], is graphed as a function of increasing total unlabeled binding site, [*U_A*]_T, with constant amounts of both labeled binding site, [A]_T, and total ligand, [a]_T. The applicable formula is $[U_A]_T = [A]_T \left( \frac{[a]_T}{[Aa]} - \frac{K_A}{[A]_T - [Aa]} - 1 \right)$. Sets of data points {([*U_A*]_T, [Aa])} along with known [A]_T can be used to find both *K_A* and [a]_T as parameters using freely available curve fitting software (see text). The values used here are: [A]_T = 2 for all curves; [a]_T = 6, 5.4, and 6.6, and *K_A* = {1.5, 5, 16.5}, {1.3, 4.4, 14.7}, {1.7, 5.6, 18.3}, for the black, red, and blue curves, respectively. The values of *K_A* were adjusted to give 3 sets of 3 curves each with the same 3 initial values (without competitor), 0.5, 1.0, and 1.5. Note that each set of curves with the same starting value diverges significantly as competitor increases. B: Performance of competition and saturation binding methods for simultaneously finding [a]_T and *K_A*. Monte Carlo analysis (100 trials are represented by each data point) of the accuracy of curve fitting to find both [a]_T and *K_A* as parameters was run with 15-point data sets at 7 different [A]_T using either competition (fixed [a]_T, varying [*U_A*]_T, as illustrated in A) or standard saturation binding (varying [a]_T, no competitor). Data sets were generated by introducing random errors into calculated values of [Aa]. These errors were randomly drawn from a normal distribution (centered on zero) such that 95% of the errors were within ±10% of the actual value (standard deviation = 5%, mean error = 4.0%, median error = 3.4%). Percent errors are shown in the values found for each parameter ([a]_T and *K_A*) using least-squares non-linear regression. These percent errors were ranked by increasing absolute value, and the 95th largest (out of 100) plotted, with error bars extending between the 90th and 99th largest. These error bars represent a 95% confidence interval for the true value of the 95th error percentile, based on standard statistical analysis. Note that the best estimate for *K_A* is provided by the competition method at low [A]_T, which simultaneously provides a precise estimate for [a]_T. See text for further explanation. C: Family of curves with different values of n. [*AaBb*] is graphed as a function of [a]_T, holding constant [b]_T and [AB]_T, for 3 different values of the cooperativity factor (n). *K_A* = 3000, *K_B* = 600, n = {5, 50, 500} for the black curves, n = {4.5, 45, 450} for the red curves, and n = {5.5, 55, 550} for the blue curves. The uppermost black curve corresponds to the black curve in Fig. 3C, top. Once *K_A*, *K_B*, and [b]_T are determined, the formula used to draw these curves can be used to find n from sets of data points {([*AaBb*], [a]_T)} using freely available software (see text). The formula is $[a]_T = [AaBb] + \frac{[AaBb] \left( [AB]_T + K_B - [b]_T + \sqrt{([AB]_T + K_B - [b]_T)^2 + 4K_B([b]_T - [AaBb] + [AaBb]/n)} \right)}{2n\,([b]_T - [AaBb] + [AaBb]/n)} + \frac{K_A [AaBb] \left( [AB]_T + K_B + [b]_T - 2[AaBb] + \sqrt{([AB]_T + K_B - [b]_T)^2 + 4K_B([b]_T - [AaBb] + [AaBb]/n)} \right)}{2n \left( ([AB]_T - [AaBb])([b]_T - [AaBb]) - K_B [AaBb]/n \right)}$. Note that the upper set of curves in the upper graph (which differ among themselves only by a change in n of 10%, like each set of 3 closely situated curves) are very close together, making it difficult to determine this n (= 500) using these values of [AB]_T and [b]_T. However, when both are reduced by a factor of 10 (as shown in the lower graph), the upper set of curves (again representing n = 500) diverge more. Thus, n values that result in saturation of the probe (AB) can be more precisely determined by reducing its concentration, along with that of the fixed [b]_T (which is optimal for determining n when it is similar to [AB]).
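Both curve-fitting formulas in Fig. 4 can be exercised on synthetic, noise-free data. The Python sketch below is a simplified stand-in for the non-linear regression software the caption refers to, and rests on two substitutions that are assumptions of this example. First, because the panel A formula is linear in [a]_T and *K_A*, they are found here by ordinary linear least squares with [*U_A*]_T as the dependent variable. Second, n is found from the panel C formula by a golden-section search on log n. The values of [AB]_T and [b]_T in the example are hypothetical, since the caption does not state them.

```python
import math

def competitor_total(Aa, A_T, a_T, K_A):
    """[U_A]_T corresponding to labeled complex [Aa] (Fig. 4A formula)."""
    return A_T * (a_T / Aa - K_A / (A_T - Aa) - 1.0)

def fit_aT_KA(points, A_T):
    """Least-squares [a]_T and K_A from (U_T, Aa) pairs, using the linear
    form U_T/A_T + 1 = a_T * (1/Aa) + K_A * (-1/(A_T - Aa))."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for U_T, Aa in points:
        x1, x2, y = 1.0 / Aa, -1.0 / (A_T - Aa), U_T / A_T + 1.0
        s11 += x1 * x1; s12 += x1 * x2; s22 += x2 * x2
        s1y += x1 * y;  s2y += x2 * y
    det = s11 * s22 - s12 * s12          # 2x2 normal equations, solved directly
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

def aT_from_ternary(T, AB_T, b_T, K_A, K_B, n):
    """[a]_T required to reach ternary complex T = [AaBb] (Fig. 4C formula)."""
    X = AB_T + K_B - b_T
    c = b_T - T + T / n
    R = math.sqrt(X * X + 4.0 * K_B * c)
    AaB = T * (X + R) / (2.0 * n * c)
    free_a = K_A * T * (AB_T + K_B + b_T - 2.0 * T + R) / (
        2.0 * n * ((AB_T - T) * (b_T - T) - K_B * T / n))
    return T + AaB + free_a

def fit_n(points, AB_T, b_T, K_A, K_B, n_lo=1.0, n_hi=1e4, iters=100):
    """Cooperativity factor n from (T, a_T) pairs by golden-section search
    on log n, minimizing squared log residuals of the predicted a_T."""
    def sse(log_n):
        n, total = math.exp(log_n), 0.0
        for T, a_T in points:
            if (AB_T - T) * (b_T - T) - K_B * T / n <= 0.0:
                return math.inf          # this T is unreachable at this n
            pred = aT_from_ternary(T, AB_T, b_T, K_A, K_B, n)
            total += (math.log(pred) - math.log(a_T)) ** 2
        return total
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(n_lo), math.log(n_hi)
    for _ in range(iters):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    return math.exp(0.5 * (lo + hi))

if __name__ == "__main__":
    # Panel A values from the caption: [A]_T = 2, [a]_T = 6, K_A = 5.
    data_A = [(competitor_total(Aa, 2.0, 6.0, 5.0), Aa) for Aa in (1.0, 0.8, 0.6, 0.4, 0.2)]
    print(fit_aT_KA(data_A, 2.0))
    # Panel C: K_A = 3000, K_B = 600, n = 50; [AB]_T = [b]_T = 10 is hypothetical.
    data_C = [(T, aT_from_ternary(T, 10.0, 10.0, 3000.0, 600.0, 50.0)) for T in (0.5, 1.0, 2.0, 3.0)]
    print(fit_n(data_C, 10.0, 10.0, 3000.0, 600.0))
```

With real measurements the errors lie mainly in the measured complex rather than in the competitor or ligand totals, so a weighted non-linear fit of the kind evaluated in panel B would be the more appropriate choice; the linear and golden-section fits here only illustrate that the formulas determine the parameters.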


References

1. Hill AV. The combinations of haemoglobin with oxygen and with carbon monoxide I. Biochem J. 1913;7:471–480.
2. Peacock J, Jaynes JB. Mathematical toolkit for quantitative analysis of cooperative binding of two or more ligands to a substrate. MethodsX. submitted.
3. Wyman J, Gill SJ. Binding and Linkage: Functional Chemistry of Biological Macromolecules. University Science Books; 1990.
4. Weiss JN. The Hill equation revisited: uses and misuses. FASEB J. 1997;11:835–841.
5. Stefan MI, Le Novère N. Cooperative Binding. PLoS Comp Biol. 2013;9:e1003106.
