PLoS One. 2015 May 26;10(5):e0127446.
doi: 10.1371/journal.pone.0127446. eCollection 2015.

Complete Genome Sequence of ER2796, a DNA Methyltransferase-Deficient Strain of Escherichia coli K-12

Brian P Anton et al. PLoS One.

Abstract

We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems.

Conflict of interest statement

Competing Interests: Authors BA, EAR, DB, AF, and RJR are employed by New England Biolabs, a company that supplies competent E. coli cells as well as reagents that may be used in the construction of sequencing libraries. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Relationship of ER2796 and ER3413 to the nine other completely sequenced E. coli K-12 strains.
Completely sequenced strains are shown in bold type, and selected ancestral strains in Roman type. Most of the tree has been abstracted from Bachmann [28], except for the ancestries of MC4100 and DH10B back to Hfr Hayes, which are based on Laehnemann [63] and Durfee [4], respectively. Selected additional contributions of genetic material via crosses are shown by dotted lines. It appears based on genotype that Hfr 3000 U482 is the “U series” ancestor of DH10B, while Hfr 3000 U169 contributed genetic material in the ancestry of MC4100.
Fig 2. Lineage showing the construction of ER2796 from JC1552.
Selected intermediate genotypes are shown. Markers that were selected are shown, followed by those that were screened for in parentheses. The lineage from K-12 to JC1552 has been described previously [28]. Genotypes are shown here using the historic allele names, but we suggest an updated nomenclature for some of these in Table 3 based on the genome sequence.
Fig 3. Alignment of the MG1655 genome with ER2796 and DH10B, conducted with Progressive-Mauve.
Boundaries of the major contiguous blocks of sequence, labeled with capital letters, are formed by two major events specific to the DH10B lineage: block B results from deletion of a 34.6 kb region of MG1655 followed by partial restoration as part of a φ80Δ(lacZ)M15 mosaic prophage insertion in DH10B; and block E results from the IS10-mediated inversion of an 11 kb segment of MG1655, again in DH10B [4]. The following larger indels visible in the figure are labeled: prophage e14 lost in both ER2796 and DH10B; prophage CPZ-55 lost in ER2796; the 16 kb mtgA-yhcE region lost in ER2796 through IS5-mediated deletion; the ICR region deleted in both ER2796 and DH10B; Tn10 insertion at yedZ in ER2796; tandem duplication of a 113 kb region in DH10B, presumably IS5-mediated; the φ80Δ(lacZ)M15 mosaic prophage insertion in DH10B, including the lacZ region (part of block B).
Fig 4. Comparison of the lacZY regions of MG1655 and ER2796.
A. Schematic drawing showing the region of MG1655 lacZ and lacZY intergenic region that is deleted in ER2796. It is oriented forward with respect to the chromosomal sequence, with the operon reversed from the conventional representation. In ER2796, the lacZ ORF encodes amino acids 1–222 of MG1655 lacZ (white box) fused to 40 amino acids derived from the lacZY intergenic region, and overlapping with lacY (cross-hatched box). The putative lacY ribosome binding site (RBS) is preserved in ER2796. B. DNA and translated protein sequence of the lacZY junction, numbered from ER2796. Nucleotides and translated amino acids missing in ER2796 are shown in gray, and those present are shown in black. In ER2796, aa 1–222 of the translated ORF are shown in black, and the 40 aa derived from the intergenic region are shown in red. Start codons of lacZ and lacY are highlighted, and the putative RBS of lacY is underlined. 2160 bp (720 aa) of MG1655 lacZ have been removed at the indicated position for brevity.
Fig 5. Use of long reads to identify gene conversion events.
The schematic alignment shows the paralogous ribosomal gene clusters rrnB and rrnE from ER2796 (white genes) along with nonhomologous flanking genes (gray). The genes are marked with names and coordinates in ER2796. In ER2796, rrnB has been the apparent recipient of a gene conversion event in which rrnE served as donor (vertical arrows), and thus both regions are identical. As a result of this event, rrnB in ER2796 exhibits minor variations when compared with rrnB from its ancestor, MG1655: six SNPs (marked with *) and one indel (marked with †). Red tinted boxes indicate the regions of alteration (left and middle) and delineate the boundaries of the clusters (left and right). Sequencing reads internal to the clusters (i.e., between the outer two red boxes) cannot be mapped uniquely to one locus or the other unless they extend into the nonhomologous flanking regions, and the minor variants within (e.g., the middle red box) cannot be assigned to one cluster or the other without sequencing reads directly connecting them with a flanking region on one side or the other. The long-read library used in this analysis includes numerous reads that connect the unique flanking regions with the internal variants. The mapped coordinates of six example reads from the actual analysis are shown at the top, including some that span both sides of the 5 kb gene cluster. Arrows indicate where a read continues beyond the region shown here.
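The anchoring logic described in the Fig 5 legend can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the `Interval` type, the `anchors_variant` function, the `min_flank` parameter, and all coordinates below are hypothetical and chosen only for illustration. A read is treated as informative for a variant inside a repeated cluster only if it covers the variant and also extends into nonhomologous flanking sequence on at least one side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed genomic interval; both coordinates are inclusive."""
    start: int
    end: int


def anchors_variant(read: Interval, cluster: Interval,
                    variant_pos: int, min_flank: int = 1) -> bool:
    """Return True if the read ties the variant to one specific cluster copy.

    The read must cover the variant position and reach at least
    `min_flank` bases past the cluster boundary on the left or the right,
    i.e. into sequence that is unique to this locus.
    """
    covers_variant = read.start <= variant_pos <= read.end
    left_overhang = cluster.start - read.start    # bases left of the cluster
    right_overhang = read.end - cluster.end       # bases right of the cluster
    reaches_flank = left_overhang >= min_flank or right_overhang >= min_flank
    return covers_variant and reaches_flank


# Hypothetical coordinates: a 5 kb cluster with one internal variant.
cluster = Interval(10_000, 15_000)
variant = 12_500

internal_read = Interval(12_000, 13_000)   # lies entirely inside the cluster
left_anchored = Interval(8_000, 13_000)    # starts in the left flank
spanning_read = Interval(9_000, 16_500)    # crosses both cluster boundaries
flank_only = Interval(8_000, 11_000)       # reaches the flank, misses the variant

results = {
    "internal_read": anchors_variant(internal_read, cluster, variant),
    "left_anchored": anchors_variant(left_anchored, cluster, variant),
    "spanning_read": anchors_variant(spanning_read, cluster, variant),
    "flank_only": anchors_variant(flank_only, cluster, variant),
}
```

In this toy setup only the left-anchored and spanning reads qualify, which mirrors why reads shorter than the distance from a variant to the nearest cluster boundary cannot assign that variant to rrnB or rrnE. A real analysis would likely require a larger `min_flank` so that the flanking overlap is long enough to map uniquely.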

References

    1. Lederberg J, Tatum EL. Gene recombination in Escherichia coli. Nature. 1946;158(4016):558. Epub 1946/10/19.
    2. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277(5331):1453–62. Epub 1997/09/05.
    3. Hayashi K, Morooka N, Yamamoto Y, Fujita K, Isono K, Choi S, et al. Highly accurate genome sequences of Escherichia coli K-12 strains MG1655 and W3110. Molecular systems biology. 2006;2:2006 0007. Epub 2006/06/02. doi: 10.1038/msb4100049
    4. Durfee T, Nelson R, Baldwin S, Plunkett G 3rd, Burland V, Mau B, et al. The complete genome sequence of Escherichia coli DH10B: insights into the biology of a laboratory workhorse. Journal of bacteriology. 2008;190(7):2597–606. Epub 2008/02/05. doi: 10.1128/JB.01695-07
    5. Ferenci T, Zhou Z, Betteridge T, Ren Y, Liu Y, Feng L, et al. Genomic sequencing reveals regulatory mutations and recombinational events in the widely used MC4100 lineage of Escherichia coli K-12. Journal of bacteriology. 2009;191(12):4025–9. Epub 2009/04/21. doi: 10.1128/JB.00118-09
