Proc Natl Acad Sci U S A. 2009 Dec 8;106(49):20830-5.
doi: 10.1073/pnas.0906681106. Epub 2009 Nov 23.

Use of high throughput sequencing to observe genome dynamics at a single cell level

D Parkhomchuk et al. Proc Natl Acad Sci U S A. 2009.

Abstract

With the development of high throughput sequencing technology, it becomes possible to directly analyze mutation distribution in a genome-wide fashion, dissociating mutation rate measurements from the traditional underlying assumptions. Here, we sequenced several genomes of Escherichia coli from colonies obtained after chemical mutagenesis and observed a strikingly nonrandom distribution of the induced mutations. These include long stretches of exclusively G to A or C to T transitions along the genome and orders of magnitude intra- and intergenomic differences in mutation density. Whereas most of these observations can be explained by the known features of enzymatic processes, the others could reflect stochasticity in the molecular processes at the single-cell level. Our results demonstrate how analysis of the molecular records left in the genomes of the descendants of an individual mutagenized cell allows for genome-scale observations of fixation and segregation of mutations, as well as recombination events, in the single genome of their progenitor.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Whole genome sequencing of individual colonies and mutation identification. (A) An example of genome-wide coverage. The sequence fragments obtained by sequencing the DNA of an individual colony were aligned to the reference genome of the MG1655 strain. Abscissa, position along the MG1655 genomic sequence. Ordinate, number of fragments per 23,198-bp bin (1/200 of the genome). The positions of OriC, Ter, and the F' 128 episome are indicated. The F' 128 episome is a plasmid that contains the chromosomal proB-lacZ region and an independent replication origin, hence the discontinuity in the otherwise smooth gradient of genome coverage. The coverage (and correspondingly the amount of DNA) at Ter was about twofold lower than that at OriC for all genomes sequenced. Thus, an average exponentially growing cell contains one pair of replication forks. (B) An example of identification of individual mutations. Reads of 23–32 bases were aligned with the reference sequence (parental CC102 strain). Differences with respect to the reference sequence are indicated by color. Noise is represented by colored positions that are unique to one read. Mutations, indicated by arrows, show a consistent difference throughout all reads (Top) or through a significant fraction of the reads (mixed positions, Bottom).
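As a rough illustration of the binning described in panel A, the sketch below counts read start positions in 200 bins of 23,198 bp and compares coverage between two bins. The function names, the 0-based coordinates, and the bin indices passed to coverage_ratio are assumptions made for illustration; the caption does not describe the authors' software, and the genome length used here is simply the value implied by the caption (200 x 23,198 bp).

```python
BIN_SIZE = 23_198                   # bp per bin, as stated in the caption
N_BINS = 200                        # each bin is 1/200 of the genome
GENOME_LENGTH = BIN_SIZE * N_BINS   # length implied by the caption (assumption)


def bin_coverage(read_starts):
    """Count reads per bin from 0-based read start positions (hypothetical input)."""
    counts = [0] * N_BINS
    for pos in read_starts:
        # Clamp to the last bin so a position at the very end stays in range.
        idx = min(pos // BIN_SIZE, N_BINS - 1)
        counts[idx] += 1
    return counts


def coverage_ratio(counts, bin_a, bin_b):
    """Ratio of read counts between two bins, e.g. one near OriC and one near Ter."""
    return counts[bin_a] / counts[bin_b]


if __name__ == "__main__":
    # Tiny synthetic example: two reads in bin 0, one in bin 1, one in the last bin.
    demo = bin_coverage([0, 23_197, 23_198, GENOME_LENGTH - 1])
    print(demo[0], demo[1], demo[-1])
```

With real alignments, the two bin indices would be chosen from the annotated OriC and Ter positions; a ratio near two would correspond to the roughly twofold coverage difference reported in the caption.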
Fig. 2.
Unexpected features of the mutation distributions. (A) Example of G to A and C to T stretches. Shown is the 1294990–2135475 region of the E. coli K12 MG1655 genome. The positions of mutations (Left Column) and their type (Right Column) are indicated. Presented are mutations observed after sequencing the genome of colony H1. (B) Genome-wide distribution of mutation type. Results for 4 independent colonies are shown. The G → A, C → T and T → C transitions are indicated by open triangles, closed triangles and a circle, respectively. The “pure” or mixed state of every mutation is also indicated. Mutations in a 100% single-nucleotide state are placed on the solid circle. The distance from the solid circle is proportional to the percentage of wild-type state detected; 2 dashed lines show 50% wild-type state. Examples of mutation bunching (locations of increased mutation density that vary between different colonies) are indicated by “{”. (C) Genome-wide distribution of mutation density. Shown are the mutation densities obtained by averaging the data for the 6 sequenced genomes of the mutagenized CC102 strain. The genome was divided into 20 bins of 232 kb (coarse-grained distribution, outer curve) or into 100 bins of 46 kb (fine-grained distribution, inner curve), and the mutation numbers, as a percentage of the total, are plotted along the genome. Values closer to the center correspond to the regions of lowest mutation density. (D) Mutation bunching. The term bunching (and antibunching) is generally used to describe stochastic behavior that deviates from a random Poisson distribution, when successive events do not occur randomly but depend on neighboring events (36, 37). Such behavior is widely observed in diverse settings, from photon counting experiments to the statistics of neuron firing. The departure from Poisson statistics can be quantified by the variance to mean ratio (VMR, Fano factor).
Here the statistics of distances between successive mutations in experimental samples are compared with those of simulated random mutations. The VMR distribution for 20 (black) and 80 (gray) random mutations in the E. coli genome was obtained by simulating half a million rounds of random mutagenesis. The distribution of distances between random mutations is binomial; thus its VMR is less than one. The experimental VMR values for different samples are shown by arrows, where H1-H6 correspond to the mutagenized CC102 strain and R1-R3 to the mutagenized recA strain. All our samples fell into the right tail of the distribution, some of them displaying VMR values highly unlikely for random mutations (P value approximately 1e−4).
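The comparison in panel D can be sketched as a small Monte Carlo simulation. The code below is a minimal illustration, not the authors' procedure: the function names are hypothetical, the genome is treated as linear with the length implied by Fig. 1A (200 x 23,198 bp), and the distances are rescaled by their mean before the VMR is computed. The caption does not state how distances were scaled, so that rescaling is an assumption, and absolute VMR values from this sketch should not be compared with those in the figure.

```python
import random
from statistics import mean, pvariance

GENOME_LENGTH = 200 * 23_198  # assumption: length implied by the Fig. 1A caption


def vmr_of_gaps(positions):
    """Variance-to-mean ratio of distances between successive mutations.

    Distances are divided by their mean first (an assumption of this sketch),
    so evenly spaced mutations give 0 and clustered mutations give large values.
    """
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = mean(gaps)
    scaled = [g / mean_gap for g in gaps]
    return pvariance(scaled) / mean(scaled)


def simulate_vmr(n_mutations, n_trials, seed=0):
    """VMR values for uniformly random mutation positions (null model)."""
    rng = random.Random(seed)
    return [
        vmr_of_gaps(rng.sample(range(GENOME_LENGTH), n_mutations))
        for _ in range(n_trials)
    ]


def empirical_p_value(observed, simulated):
    """Fraction of simulated VMR values at least as large as the observed one."""
    return sum(v >= observed for v in simulated) / len(simulated)


if __name__ == "__main__":
    null = simulate_vmr(n_mutations=20, n_trials=2_000)
    # Hypothetical clustered sample: 15 mutations packed together, 5 spread out.
    clustered = list(range(1_000_000, 1_015_000, 1_000)) + [
        500_000, 2_000_000, 3_000_000, 4_000_000, 4_500_000
    ]
    print(empirical_p_value(vmr_of_gaps(clustered), null))
```

The caption reports half a million simulated rounds; the demo uses far fewer trials only to keep the run time short, which limits the smallest P value it can resolve.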
Fig. 3.
Models and experimental verifications. (A) Asymmetric stretches. (Top) Scheme of O6-alkylguanine (O6-aG) specifically mispairing with thymine (T), which should result in a G:C → A:T replacement after a second round of replication. (Bottom) Model of the generation of asymmetric stretches. For simplicity, the original sequence is depicted as consisting of G and C only, with each G alkylated by EMS treatment. After the first replication round, 2 daughter strands are generated, both carrying T paired with O6-aG. After the second replication round, the DNA molecules in which both strands are newly synthesized carry exclusively either G → A (Left) or C → T (Right) replacements. Repair (for example, via removal of alkyl groups by the methyltransferase MGMT) is also shown as conversion of “G*” back into “G.” (B) Asymmetric stretches in the recA background. Genome-wide distribution of G → A (open triangle) and C → T (closed triangle) mutations for 3 colonies of the recA mutant subjected to EMS treatment and processed as in Fig. 1A. Locations of increased mutation density that vary between different colonies are indicated by “{”. Mutations that are not of the G:C → A:T type are indicated by a circle. Most of these positions have a mixed state parameter r < 50%.
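The model in panel A can be traced with a toy simulation. The sketch below is an illustration under the caption's own simplifications (every G on both parental strands is alkylated and no repair occurs) plus one more assumption of this sketch: both strands are written position by position in the same left-to-right coordinate, so complementation does not reverse the string. All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template, alkylated):
    """Synthesize the strand paired with `template`, position by position.

    If the template is alkylated, every G is treated as O6-aG and pairs with T
    instead of C (the mispairing described in the caption).
    """
    return "".join(
        "T" if (base == "G" and alkylated) else COMPLEMENT[base]
        for base in template
    )


def two_rounds(top):
    """Return top-strand sequences of the two duplexes made only of new strands."""
    bottom = copy_strand(top, alkylated=False)       # original partner strand
    # Round 1: both parental strands carry O6-aG at every G.
    new_bottom = copy_strand(top, alkylated=True)    # copied from parental top
    new_top = copy_strand(bottom, alkylated=True)    # copied from parental bottom
    # Round 2: the round-1 strands carry no lesions and are copied faithfully.
    left_top = copy_strand(new_bottom, alkylated=False)
    right_top = new_top
    return left_top, right_top


def mutation_types(reference, mutant):
    """Set of substitution types, written in top-strand coordinates."""
    return {f"{r}>{m}" for r, m in zip(reference, mutant) if r != m}


if __name__ == "__main__":
    reference = "GGCGCC"
    left, right = two_rounds(reference)
    print(left, mutation_types(reference, left))
    print(right, mutation_types(reference, right))
```

In this toy model one lineage carries only G → A changes and the other only C → T changes relative to the reference strand, which is the asymmetry the caption describes. Adding a repair step would amount to reverting some alkylated positions to normal G before copying.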
Fig. 4.
Role of competition between replication and repair in generation of mutations. (A) Starved cells. Genome-wide distribution of mutations in CC102 strain cells kept in a nonreplicating state (PBS) overnight after EMS treatment and before plating. Three sequenced genomes are shown, with the mutation positions indicated as in Fig. 3B. (B) Model of competition between replication and repair. Most of the cells in an exponential culture have already replicated the regions around OriC. When placed in nutrient-lacking medium after EMS treatment, the cells can complete replication of the area close to the terminus, giving the O6-aG lesions in this area a chance to be converted to mutations before repair. In contrast, the area around OriC has less chance to replicate again, so the O6-aG lesions there are more likely to be repaired before replication, thus avoiding mutation fixation.
