Proc Natl Acad Sci U S A. 2012 Oct 9;109(41):E2774-83.
doi: 10.1073/pnas.1210309109. Epub 2012 Sep 18.

Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing

Heewook Lee et al. Proc Natl Acad Sci U S A.

Abstract

Knowledge of the rate and nature of spontaneous mutation is fundamental to understanding evolutionary and molecular processes. In this report, we analyze spontaneous mutations accumulated over thousands of generations by wild-type Escherichia coli and a derivative defective in mismatch repair (MMR), the primary pathway for correcting replication errors. The major conclusions are (i) the mutation rate of a wild-type E. coli strain is ~1 × 10⁻³ per genome per generation; (ii) mutations in the wild-type strain have the expected mutational bias for G:C > A:T mutations, but the bias changes to A:T > G:C mutations in the absence of MMR; (iii) during replication, A:T > G:C transitions preferentially occur with A templating the lagging strand and T templating the leading strand, whereas G:C > A:T transitions preferentially occur with C templating the lagging strand and G templating the leading strand; (iv) there is a strong bias for transition mutations to occur at 5'ApC3'/3'TpG5' sites (where bases 5'A and 3'T are mutated) and, to a lesser extent, at 5'GpC3'/3'CpG5' sites (where bases 5'G and 3'C are mutated); (v) although the rate of small (≤4 nt) insertions and deletions is high at repeat sequences, these events occur at only 1/10th the genomic rate of base-pair substitutions. MMR activity is genetically regulated, and bacteria isolated from nature often lack MMR capacity, suggesting that modulation of MMR can be adaptive. Thus, comparing results from the wild-type and MMR-defective strains may lead to a deeper understanding of factors that determine mutation rates and spectra, how these factors may differ among organisms, and how they may be shaped by environmental conditions.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Distribution of mutations among MA lines. The number of lines with a given number of mutations is plotted against the number of mutations per line and compared with the Poisson distribution expected for the mean number of mutations per line (solid traces). (A) Wild-type 3K lines (38 lines; triangles) and wild-type 6K lines (21 lines; diamonds). (B) MutL lines (34 lines).
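
The comparison described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name `poisson_expected_counts` and the per-line mutation counts are hypothetical, chosen only to show how the expected number of lines with k mutations follows from the mean number of mutations per line.

```python
import math


def poisson_expected_counts(mutations_per_line):
    """Expected number of lines carrying k mutations under a Poisson model.

    The Poisson mean is set to the observed mean number of mutations per
    line, as in the caption. Returns {k: expected number of lines}.
    """
    n_lines = len(mutations_per_line)
    mean = sum(mutations_per_line) / n_lines
    max_k = max(mutations_per_line)
    return {
        k: n_lines * math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(max_k + 1)
    }


# Hypothetical mutation counts for eight MA lines (NOT data from the paper).
example_counts = [0, 1, 2, 1, 0, 3, 2, 1]
expected = poisson_expected_counts(example_counts)
observed = {k: example_counts.count(k) for k in expected}

for k in expected:
    print(f"k={k}  observed lines={observed[k]}  expected lines={expected[k]:.2f}")
```

Plotting `observed` as points and `expected` as a trace against k would give the kind of comparison the figure describes.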
Fig. 2.
Mutation rates of each of the six BPSs. Bars represent the mutation rate of each type of BPS normalized to the number of AT or GC base pairs in the genome. (A) Wild-type 3K dataset (93 mutations; blue) and wild-type 6K dataset (140 mutations; red). (B) MutL dataset (1,625 mutations). Error bars represent the fifth percentile and 95th percentile values from 1,000 Monte Carlo simulations of a random distribution with the mutational spectra observed for each dataset.
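
One plausible reading of the error-bar procedure is a multinomial resampling of the observed spectrum. The sketch below follows that reading and is an assumption, not the authors' documented method: each simulation redraws the same total number of BPSs with probabilities equal to the observed proportions, and the 5th and 95th percentiles of the simulated counts are reported. The function name `simulate_count_intervals`, the nearest-rank percentile rule, and the counts are all hypothetical.

```python
import random


def simulate_count_intervals(observed_counts, n_sims=1000, seed=0):
    """Return {bps_type: (5th percentile, 95th percentile)} of simulated counts.

    Each simulation draws sum(observed_counts) mutations, choosing the BPS
    type with probability proportional to its observed count.
    """
    rng = random.Random(seed)
    types = list(observed_counts)
    weights = [observed_counts[t] for t in types]
    total = sum(weights)

    simulated = {t: [] for t in types}
    for _ in range(n_sims):
        draw = rng.choices(types, weights=weights, k=total)
        for t in types:
            simulated[t].append(draw.count(t))

    intervals = {}
    for t in types:
        ordered = sorted(simulated[t])
        # Simple nearest-rank percentiles (an illustrative choice).
        low = ordered[int(0.05 * (n_sims - 1))]
        high = ordered[int(0.95 * (n_sims - 1))]
        intervals[t] = (low, high)
    return intervals


# Hypothetical counts for the six BPS types (NOT data from the paper).
example_spectrum = {
    "A:T>G:C": 20, "G:C>A:T": 35,                  # transitions
    "A:T>T:A": 10, "A:T>C:G": 10,                  # transversions at A:T
    "G:C>T:A": 20, "G:C>C:G": 5,                   # transversions at G:C
}
intervals = simulate_count_intervals(example_spectrum)
for bps_type, (low, high) in intervals.items():
    print(f"{bps_type}: observed={example_spectrum[bps_type]}  5th={low}  95th={high}")
```

To place these intervals on the scale of the figure, each count would still need to be normalized to the number of AT or GC base pairs in the genome, which this sketch leaves out.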
Fig. 3.
Indel formation at runs. (A) The relative mutation rate of indels in a run of a given length is plotted against the length of the run. The relative mutation rate of indels in each run length is the number of observed indels divided by the total number of target nucleotides (= nucleotides in the run × the number of runs of that length in the genome × the number of MA lines in the analysis). Blue diamonds represent the combined wild-type 3K and 6K datasets (21 indels); red diamonds represent the MutL dataset (306 indels). Lines are the least-squares fits to the data. (B) Values for the MutL dataset (306 indels). Blue bars represent the expected number of indels in a run of a given length calculated as the total number of indels obtained × the fraction of runs in the genome of that length. Red bars represent the actual number of indels observed in each run length.
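
The two quantities defined in this caption translate directly into arithmetic. The sketch below is illustrative only: the function names `relative_indel_rate` and `expected_indels_by_run_length` are hypothetical, and the run counts are invented round numbers, not genome statistics from the paper.

```python
def relative_indel_rate(observed_indels, run_length, runs_in_genome, n_lines):
    """Panel A: observed indels divided by the total number of target nucleotides.

    Target nucleotides = nucleotides in the run x number of runs of that
    length in the genome x number of MA lines in the analysis.
    """
    target_nucleotides = run_length * runs_in_genome * n_lines
    return observed_indels / target_nucleotides


def expected_indels_by_run_length(total_indels, runs_by_length):
    """Panel B: total indels x fraction of genomic runs having each length."""
    total_runs = sum(runs_by_length.values())
    return {
        length: total_indels * count / total_runs
        for length, count in runs_by_length.items()
    }


# Hypothetical inputs (NOT values from the paper).
rate = relative_indel_rate(observed_indels=6, run_length=5, runs_in_genome=100, n_lines=3)
expected_by_length = expected_indels_by_run_length(10, {1: 60, 2: 30, 3: 10})

print(rate)                # 6 / (5 * 100 * 3)
print(expected_by_length)  # expected indels per run length if indels ignored run length
```

Observed counts that exceed the expected values at longer run lengths would indicate that indels concentrate in longer runs, which is the comparison the blue and red bars in panel B make.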
