Proc Natl Acad Sci U S A. 2007 Aug 28;104(35):13948-53.
doi: 10.1073/pnas.0700672104. Epub 2007 Aug 20.

How gene order is influenced by the biophysics of transcription regulation

Grigory Kolesov et al. Proc Natl Acad Sci U S A. 2007.

Abstract

What are the forces that shape the structure of prokaryotic genomes: the order of genes, their proximity, and their orientation? Coregulation and coordinated horizontal gene transfer are believed to promote the proximity of functionally related genes and the formation of operons. However, forces that influence the structure of the genome beyond the level of a single operon remain unknown. Here, we show that the biophysical mechanism by which regulatory proteins search for their sites on DNA can impose constraints on genome structure. Using simulations, we demonstrate that rapid and reliable gene regulation requires that the transcription factor (TF) gene be close to the site on DNA the TF has to bind, thus promoting the colocalization of TF genes and their targets on the genome. We use parameters that have been measured in recent experiments to estimate the relevant length and time scales of this process and demonstrate that the search for a cognate site may be prohibitively slow if a TF has a low copy number and is not colocalized. We also analyze TFs and their sites in a number of bacterial genomes, confirm that they are colocalized significantly more often than expected, and show that this observation cannot be attributed to the pressure for coregulation or formation of selfish gene clusters, thus supporting the role of the biophysical constraint in shaping the structure of prokaryotic genomes. Our results demonstrate how spatial organization can influence timing and noise in gene expression.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
We propose the rapid search hypothesis as an explanation for colocalization of transcription factor genes and their targets and model the search process with hops, jumps, and slides. (A) The rapid search hypothesis. In prokaryotes, transcription and translation are coupled; therefore, transcription factors (TFs) are released from the ribosome near their encoding gene, enabling a TF to rapidly search the DNA nearby. The rapid search hypothesis suggests that TF genes and their binding sites may be colocalized on the chromosome because this enables newly synthesized TFs to rapidly find their binding sites. (B) Model of the transcription factor search process. We define three types of movements for TFs: slides, rounds of 1D diffusion along the DNA; hops, short rounds of 3D diffusion where the TF dissociates from the DNA and rebinds at a site very nearby; and jumps, longer rounds of 3D diffusion where the TF dissociates from the DNA and binds a site that may be quite far away. Mathematically, the dissociation and association sites of hops are correlated, whereas those of jumps are uncorrelated. We then model the search process as alternating rounds of 3D and 1D diffusion; the TF ends the slide with either a hop or a jump. (C) We find that the hops are so short that they can be accounted for by rescaling the sliding length, s (the number of base pairs scanned in a slide), by the number of hops per jump, n_hops, to get s_e, the number of base pairs scanned between jumps: s_e = s · n_hops.
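The rescaling in panel C and the alternating rounds in panel B can be sketched in a few lines of Python. This is a toy illustration, not the authors' simulation code. It assumes that every jump lands at a uniformly random position on the genome and that each round between jumps scans s_e base pairs, so a round finds the site with probability s_e / genome length. The function names and the example values (s = 66 bp, n_hops = 10, a 6,600-bp genome) are hypothetical.

```python
import random


def effective_sliding_length(s, n_hops):
    """Base pairs scanned between jumps: s_e = s * n_hops."""
    return s * n_hops


def expected_rounds(genome_length, s_e):
    """Mean number of slide-and-hop rounds before the site is found.

    Assumption: each round independently covers the site with
    probability s_e / genome_length (a geometric distribution).
    """
    return genome_length / s_e


def simulate_rounds(genome_length, s_e, rng):
    """Count rounds until one of them covers the binding site."""
    p_hit = s_e / genome_length
    rounds = 1
    while rng.random() >= p_hit:
        rounds += 1  # miss: the TF jumps and starts a new round
    return rounds


if __name__ == "__main__":
    rng = random.Random(0)
    s_e = effective_sliding_length(s=66, n_hops=10)  # hypothetical values
    trials = [simulate_rounds(6_600, s_e, rng) for _ in range(10_000)]
    print("s_e =", s_e, "bp")
    print("expected rounds:", expected_rounds(6_600, s_e))
    print("simulated mean rounds:", sum(trials) / len(trials))
```

Under these assumptions a TF that starts far from its site needs on the order of genome length / s_e rounds, whereas a TF released within roughly s_e of its site may need none, which is the intuition behind colocalization.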
Fig. 2.
Simulations of the transcription factor search process show that search time depends on the starting point. (A) Search time for a group of 10 TFs versus L. Here, we simulated a group of 10 TFs searching for a binding site and plot the mean search time, t_s, of the first TF to reach the site versus the initial distance L. Here, s_e = 660 bp, and 500 runs were simulated for each L. (B) The probability of fast runs. Here, the probability of a fast run, a run in which the TF starts near its binding site and finds it by hopping and sliding but without jumping, is plotted versus the initial distance between the TF and its site, L. The main plot shows L in base pairs, and the value of s_e = s · n_hops ≈ 660 bp. (Inset) L in units of s_e; the different symbols correspond to different values of s (blue triangles, s = 270; green squares, s = 50; yellow circles, s = 100; red diamonds, s = 500). Each data point is the mean of 1,000 trials. The overlap in Inset shows that the behavior is parameter-independent when L is expressed in units of s_e, confirming that s_e is the only relevant parameter in this simulation. (C) Distribution of run times for fast and slow runs. The distribution of search times, t_s, is plotted for fast and slow runs, where fast runs are defined as above and slow runs are searches in which the TF uses hopping, sliding, and jumping to find its binding site. The box has lines at the lower, median, and upper quartile values. The whiskers extend from the box to 1.5 times the interquartile range, the difference between the lower and upper quartiles. Data points beyond the whiskers are noted as circles. Each plot includes the data from 30,000 runs. This plot clearly shows that (i) fast runs are much faster than slow runs (fast runs have a median of 0.2 s, whereas slow runs have a median of 100 min) and (ii) fast runs are much less variable than slow runs.
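The dependence of the fast-run probability on L in panel B can be illustrated with a deliberately simplified model. The sketch below is not the simulation used in the paper: it assumes a TF that, at every step, either dissociates with a fixed probability p_off (ending the fast run) or moves one base pair left or right with equal odds, and it folds hops into the sliding step. The names fast_run_probability, simulate_fast_runs, and p_off are hypothetical.

```python
import math
import random


def fast_run_probability(L, p_off):
    """Analytic chance of reaching the site before the first dissociation.

    Toy model: at each step the TF dissociates with probability p_off,
    otherwise it moves one base pair left or right with equal odds.
    The hitting probability h(L) solves h(L) = (q/2) * (h(L-1) + h(L+1))
    with q = 1 - p_off and h(0) = 1, which gives h(L) = r**L.
    """
    q = 1.0 - p_off
    r = (1.0 - math.sqrt(1.0 - q * q)) / q
    return r ** L


def simulate_fast_runs(L, p_off, n_trials, rng):
    """Monte Carlo estimate of the same probability."""
    fast = 0
    for _ in range(n_trials):
        x = L  # current distance from the binding site, in bp
        while x != 0:
            if rng.random() < p_off:
                break  # dissociated: this run would need a jump
            x += 1 if rng.random() < 0.5 else -1
        else:
            fast += 1  # loop ended because the site was reached
    return fast / n_trials


if __name__ == "__main__":
    rng = random.Random(0)
    for L in (0, 5, 10, 20):
        est = simulate_fast_runs(L, p_off=0.01, n_trials=5_000, rng=rng)
        print(L, round(est, 3), round(fast_run_probability(L, 0.01), 3))
```

In this toy model the fast-run probability decays geometrically with the starting distance, with a decay length set by how far the TF scans before dissociating. That is qualitatively the role s_e plays in the caption, although the paper's model and parameters differ.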
Fig. 3.
We show that local transcription factors are colocalized with their targets. (A) Possible orientations of a TF gene, its binding site (BS), and the regulated transcription unit (TU). In this diagram, the TF gene encodes a TF that regulates the expression of the TU by binding the BS. In our study, we aimed to determine the extent of colocalization that cannot be explained by co- or self-regulation. Therefore, we excluded the third orientation, because the TF and TU may be coregulated through a shared promoter region (coregulation), and the fourth orientation, because the TF may be part of the same operon as the TU (self-regulation). (B) Distances between local TFs and their binding sites. Here, TF–TU distances for local regulators are shown as bars, and the distances expected from random TF–TU assignments are shown by the blue line. Local TFs are significantly colocalized with their binding sites on length scales comparable with s_e ≈ 10^3 bp, suggesting that the rapid search hypothesis is feasible. (C) Distances between global TFs and their binding sites. Here again, TF–TU distances for global regulators are shown as bars, and the expected distribution for the random TF–TU assignment is shown by the blue line. For global TFs, there is no significant colocalization, suggesting that rapid search may be achieved by high copy number instead.
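One way to picture the comparison against random TF–TU assignments in panels B and C is a permutation test. The sketch below is an assumption about how such a null could be built, not the procedure documented in the paper: it shuffles which site is paired with which TF gene and asks how often the shuffled pairings are at least as colocalized as the observed ones. All names, the circular-chromosome distance, and the cutoff parameter are hypothetical.

```python
import random


def circular_distance(a, b, genome_length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % genome_length
    return min(d, genome_length - d)


def fraction_within(tf_pos, site_pos, genome_length, cutoff):
    """Fraction of TF-site pairs separated by at most `cutoff` bp."""
    close = sum(
        circular_distance(t, s, genome_length) <= cutoff
        for t, s in zip(tf_pos, site_pos)
    )
    return close / len(tf_pos)


def permutation_p_value(tf_pos, site_pos, genome_length, cutoff, n_perm, rng):
    """One-sided p-value: how often random pairings look at least as colocalized."""
    observed = fraction_within(tf_pos, site_pos, genome_length, cutoff)
    shuffled = list(site_pos)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of sites to TF genes
        if fraction_within(tf_pos, shuffled, genome_length, cutoff) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_perm + 1)
```

A typical call would pass the genomic coordinates of each TF gene and of its binding site, with a cutoff near s_e. Real analyses would also need to handle TFs with several targets, which this sketch ignores.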
Fig. 4.
We test the selfish gene cluster hypothesis as a reason for colocalization. (A) Two considered orientations of a TF and its target TU: downstream unidirectional and convergent. Both orientations have the same recombination distance and thus are equally favored by the selfish gene cluster hypothesis. However, only the downstream unidirectional orientation provides a small travel distance for the TF to find its site and therefore is favored by the rapid search hypothesis. Other TF–TU orientations were not considered because they can play a functional role in providing self/coregulation (see Fig. 3A). (B) The frequency of unidirectional (purple) versus convergent (gray) orientations of TF genes and their target TUs as a function of TF–TU distance. The observed prevalence of unidirectional TF–TU pairs cannot be explained by the selfish gene cluster hypothesis but provides evidence to support the rapid search mechanism. TF–TU distance was measured between the closest ends of the TF and the TU and thus does not depend on orientation or number of ORFs in the TU. Notice that both orientations are equally likely at larger separations (>5 kbp), ruling out an a priori bias toward the unidirectional orientation due to other factors. TF–TU pairs at distances <100 bp were neglected to exclude the possibility that TF–TU pairs belonging to the same operon or a read-through locus caused the prevalence of the unidirectional orientation. (C) Histogram of the number of TU–TU pairs for two downstream TU orientations. This histogram includes data from EcoCyc for verified Escherichia coli TUs (operons). Here, we see that there is no general bias toward the unidirectional configuration in the genome (also see SI Fig. 7).
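The argument in panels A and B rests on a simple null expectation: if only recombination distance mattered, unidirectional and convergent orientations should be equally likely. A minimal way to quantify an excess of unidirectional pairs is an exact one-sided binomial test against a 50/50 split. The sketch below is offered as an illustration under that assumption; the caption does not say which statistical test the authors used, and the function names are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def orientation_bias_p_value(n_unidirectional, n_convergent):
    """One-sided test of an excess of unidirectional pairs over a 50/50 null."""
    n = n_unidirectional + n_convergent
    return binomial_upper_tail(n_unidirectional, n)
```

Called with the counts from a single distance bin, a small p-value would indicate more unidirectional pairs than the equal-likelihood null predicts, while roughly equal counts (as the caption reports for separations above 5 kbp) would give a large one.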
Fig. 5.
Examples of colocalized TF–TU pairs. TF genes are shown in red, regulated TU genes in green, negative regulation (repression) as blue blunt arrows, and positive regulation (activation) as green arrows, where dotted arrows represent weak regulation. In the first example, GalS represses the adjacent galactose transport operon mglBAC and itself, forming a negative-feedback loop. In the presence of galactose, GalS dissociates from its binding sites at the mglBAC and galS promoters, starting the transcription of these genes. However, the difference in DNA binding constants between GalS and the GalS–galactose complex is relatively small, about 2 orders of magnitude. Experimental data suggest (45) that even in the presence of galactose, when GalS reaches a sufficiently large concentration, the repressor again shuts down the transcription of the transporter genes encoded in the mglBAC operon. It can be speculated that at the low expression levels typical for local regulators such as GalS and LacI, this is possible only in cases where a high local concentration of GalS is reached, making the release of the newly synthesized GalS protein near the 3′ end of the galS gene and the 5′ end of the mglBAC operon a key factor.

References

    1. Pardee AB, Jacob F, Monod J. J Mol Biol. 1959;1:165–178.
    2. Warren PB, ten Wolde PR. J Mol Biol. 2004;342:1379–1390.
    3. Korbel JO, Jensen LJ, von Mering C, Bork P. Nat Biotechnol. 2004;22:911–917.
    4. Kepes F. J Mol Biol. 2004;340:957–964.
    5. Hershberg R, Yeger-Lotem E, Margalit H. Trends Genet. 2005;21:138–142.
