Mol Syst Biol. 2010 May 11;6:366.
doi: 10.1038/msb.2010.21.

Chromosomal periodicity and positional networks of genes in Escherichia coli

Anthony Mathelier et al. Mol Syst Biol. 2010.

Abstract

The structure of dynamic folds in microbial chromosomes is largely unknown. Here, we find that genes with a highly biased codon composition, which characterize a functional core in Escherichia coli K12, are periodically distributed along the chromosomal arcs. This suggests an encoded three-dimensional genomic organization that supports functional activities, among them translation and, possibly, transcription. The periodicity extends to functional classes of genes, which are shown to organize systematically into two independent positional gene networks, one driven by metabolic genes and the other by genes involved in cellular processing and signaling. We conclude that functional reasons justify periodic gene organization. This finding raises new questions about the evolutionary pressures imposed on the chromosome. Our methodological approach is based on single-genome analysis. Given either core genes or genes organized in functional classes, we analyze the detailed distribution of distances between pairs of genes through a parameterized model based on signal processing, and find that these groups of genes tend to be separated by a regular integral distance. The methodology can be applied to any set of genes and can serve as a footprint for large-scale bacterial and archaeal analysis.
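The distance-based analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' parameterized model: it assumes linear (not arc-based) gene coordinates, a simple linear detrend, and a plain FFT periodogram, and the names `pairwise_distances`, `dominant_period`, and `bin_size` are hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_distances(positions):
    """Absolute distances between all unordered pairs of gene positions (nt)."""
    pos = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(pos), k=1)
    return np.abs(pos[i] - pos[j])

def dominant_period(positions, bin_size=1000.0):
    """Estimate the strongest period (nt) in the pairwise-distance histogram.

    Assumption: a linear detrend and a plain FFT stand in for the
    parameterized signal-processing model mentioned in the abstract.
    """
    d = pairwise_distances(positions)
    n_bins = int(np.ceil(d.max() / bin_size))
    counts, _ = np.histogram(d, bins=n_bins, range=(0.0, n_bins * bin_size))

    # Remove the linear trend of the histogram before the spectral analysis.
    x = np.arange(n_bins)
    slope, intercept = np.polyfit(x, counts, 1)
    residual = counts - (slope * x + intercept)

    # Periodogram: squared magnitude of the real FFT of the detrended series.
    power = np.abs(np.fft.rfft(residual)) ** 2
    freqs = np.fft.rfftfreq(n_bins, d=bin_size)  # cycles per nucleotide

    # Skip the zero-frequency term and report the period of the largest peak.
    k = 1 + int(np.argmax(power[1:]))
    return 1.0 / freqs[k]
```

Applied to synthetic positions spaced at a regular interval with moderate jitter, `dominant_period` should return a value close to that interval, limited by the frequency resolution of the histogram. Real chromosomal data would also need the circular and arc-based distance definitions used in the paper.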

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Different models of the circular chromosome. Leading and lagging strands are shown, with arrows indicating the 5′–3′ direction. Chromosomal arcs are distinguished by color (purple, brown). Scirc, Sall, Slagging, Sleading are the sets of distances between genes associated with the models.
Figure 2
Histograms and periodograms are computed on the distance distribution of E. coli K12 core genes (with the full arc-based model; B) and on the theoretical distribution (A). The original distance distribution has been symmetrized, giving rise to an ‘isosceles triangle’ shape. Detrending is applied to the histogram series in the FFT analysis. The E. coli K12 histogram is truncated at the central column; the maximum y value is 706. Distances near zero are overrepresented, and the corresponding peak has been removed from the periodogram (this peak is retained on the plot for the ideal distribution). Notice that no periodic signal can be detected from the ideal distribution.
Figure 3
Periods and correlation between SCCI values and expression levels. (A) Plots of expression levels for wild type in log-phase growth (bottom, blue curve, values of 3982 genes) and SCCI values (top, red curve, values of 4295 genes) of E. coli K12 genes, along the chromosome, with a periodically spaced grid in solid lines of period 100 698 nt=3 × 33 566 nt and a grid in dashed points of period 33 566 nt. Many of the highest peaks in the transcription profile appear to fall near the 33-kb grid lines defined by core genes. Smoothed values (denoted SCCI* and Expression*) are used to plot the curves. Distance above the horizontal axis indicates increasing SCCI* and below the horizontal axis indicates increasing expression. (B) Local correlation is computed for the two curves in (A); chromosomal sectors are highlighted by thick lines and named with capital letters; sector A overlaps right and left arcs. (C) Chromosomal spiral of period 33 566 nt; strips are highlighted in violet along sectors. Most peaks of the SCCI* curve are located within these strips (Table I) and their narrow localization supports the visual effect of peaks matching the 33-kb grid in (A). (D) Sectors and strips collecting the strongest peaks of the SCCI* curve, that is peaks with a spectral value μ+σ, where μ and σ are the mean and the s.d. of the SCCI* distribution.
Figure 4
E. coli K12 genes arranged on left and right arcs of the chromosomal spiral (period 33 566 nt). (A) Genes are plotted in colors corresponding to intervals of the smoothed SCCI* curve. Only genes whose value is 1.1 s.d. above the mean (i.e. ⩾0.1964) of the SCCI* distribution are plotted (they correspond to the highest peaks in Figure 3A). For each gene, we expand the diameter of the dot corresponding to its position from the origin (O). (B) All core genes are plotted, with colors corresponding to smoothed values. (C, D) Distribution of all core genes on the period 33 566 nt (starting from the origin O) and identified on successive windows of 5500 nt along chromosomal arcs; smoothed curves (red) are computed with an s.d. of 2000 nt.
Figure 5
Functional distribution of genes on the chromosomal spiral. (A) The distribution of genes belonging to three main functional groups is computed by counting the number of genes located on the chromosomal spiral of period 33 566 nt (starting from the origin O) and on successive windows of 5500 nt along chromosomal arcs. The three functional groups are: information storage and processing (based on COGs classes A, B, J, K, L), cellular processing and signaling (D, M, N, O, P, T, U, V, W, Y, Z), and metabolism (C, E, F, G, H, I, Q). The distribution of core genes (red) is plotted for comparison. Curves are smoothed with a Gaussian smoothing window with an s.d. of 2000 nt. (B) Curves (smoothed and non-smoothed) tracing the chromosomal distribution of core genes along leading (top) and lagging (bottom) strands for left and right arcs. Two sharp periodic co-localizations of genes are observed on lagging strands, on both the right and the left arc. The same clear signal appears on leading strands on the left arc.
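The folding of gene positions onto a single period, followed by Gaussian smoothing, can be sketched as below. This is an illustrative approximation only: it assumes fixed bins of a hypothetical width `bin_size` rather than the 5500 nt windows of the caption, ignores the separation into left and right arcs, and uses a circular Gaussian kernel as one possible reading of the smoothing step. The function name `folded_density` and its parameters are assumptions, not taken from the paper.

```python
import numpy as np

PERIOD = 33_566  # nt, the period reported in the figure captions

def folded_density(positions, origin=0, period=PERIOD, bin_size=500, sigma=2000):
    """Count genes by position modulo `period` and smooth the counts.

    Assumptions (not from the paper): fixed-width bins of roughly
    `bin_size` nt and a circular Gaussian kernel of s.d. `sigma` nt.
    """
    # Position of each gene within one turn of the spiral.
    phase = (np.asarray(positions, dtype=float) - origin) % period

    n_bins = int(np.ceil(period / bin_size))
    counts, edges = np.histogram(phase, bins=n_bins, range=(0.0, period))
    width = period / n_bins  # actual bin width in nt

    # Gaussian kernel sampled on the bin grid, normalised to sum to 1,
    # then rotated so that its peak sits at index 0.
    offsets = (np.arange(n_bins) - n_bins // 2) * width
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    kernel = np.roll(kernel, -(n_bins // 2))

    # Circular convolution via FFT, so the smoothing wraps around the period.
    smoothed = np.real(np.fft.ifft(np.fft.fft(counts) * np.fft.fft(kernel)))

    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts, smoothed
```

Because the kernel is normalised, the smoothed curve preserves the total gene count; a set of genes that co-localize at one phase of the period appears as a single peak in `smoothed`.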
Figure 6
Chromosomal gene mapping: curves (non-smoothed) tracing the chromosomal distribution of core genes (A), all genes in COGs classes (B), and all E. coli K12 genes (C) along the entire chromosome with a period of 33 566 nt. The periodic co-localization of most core genes around a peak between 10 and 20 kb from the origin is clearly defined. Two peaks corresponding to the two identified networks are suggested for COGs-classified genes, and clearly observed for all genes.
