Nucleic Acids Res. 2016 Sep 6;44(15):7109-19.
doi: 10.1093/nar/gkw647. Epub 2016 Jul 18.

Insertion sequence-caused large-scale rearrangements in the genome of Escherichia coli

Heewook Lee et al. Nucleic Acids Res. 2016.

Abstract

A majority of large-scale bacterial genome rearrangements involve mobile genetic elements such as insertion sequence (IS) elements. Here we report novel insertions and excisions of IS elements and recombination between homologous IS elements identified in a large collection of Escherichia coli mutation accumulation lines by analysis of whole genome shotgun sequencing data. Based on 857 identified events (758 IS insertions, 98 recombinations and 1 excision), we estimate that the rate of IS insertion is 3.5 × 10^-4 insertions per genome per generation and the rate of IS homologous recombination is 4.5 × 10^-5 recombinations per genome per generation. These events are mostly contributed by the IS elements IS1, IS2, IS5 and IS186. Spatial analysis of new insertions suggests that transposition is biased to proximal insertions, and the length spectrum of IS-caused deletions is largely explained by local hopping. For any of the ISs studied there is no region of the circular genome that is favored or disfavored for new insertions, but there are notable hotspots for deletions. Some elements have preferences for non-coding sequence or for the beginning and end of coding regions, largely explained by target site motifs. Interestingly, transposition and deletion rates remain constant across the wild-type and 12 mutant E. coli lines, each deficient in a distinct DNA repair pathway. Finally, we characterized the target sites of four IS families, confirming previous results and characterizing a highly specific pattern at IS186 target sites, 5'-GGGG(N6/N7)CCCC-3'. We also detected 48 long deletions not involving IS elements.
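As an illustration of the IS186 target-site pattern 5'-GGGG(N6/N7)CCCC-3', the following is a minimal sketch of a consensus scan over a DNA string. It is not the authors' method: the function name `find_is186_like_sites` and the toy sequence are hypothetical, and a plain regular expression ignores any position-specific base preferences inside the spacer. Because the pattern is its own reverse complement in the GGGG/CCCC flanks, scanning a single strand is enough for this crude consensus.

```python
import re

# GGGG, then a 6- or 7-base spacer, then CCCC.
# The lookahead wrapper lets overlapping candidate sites be reported.
IS186_LIKE = re.compile(r"(?=(GGGG[ACGT]{6,7}CCCC))")


def find_is186_like_sites(seq):
    """Return (0-based start, matched text) for each consensus match.

    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in IS186_LIKE.finditer(seq)]


# Toy sequence (made up): one site with a 6-base spacer, one with a 7-base spacer.
toy = "AT" + "GGGG" + "ACGTAC" + "CCCC" + "AT" + "GGGG" + "ACGTACG" + "CCCC" + "TA"

for start, site in find_is186_like_sites(toy):
    spacer_len = len(site) - 8  # subtract the GGGG and CCCC flanks
    print(start, site, f"spacer={spacer_len}")
```

On a real genome such a scan would only list candidate sites matching the consensus; whether IS186 actually inserts there is an empirical question that the scan cannot answer.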

Figures

Figure 1.
Circos plot displaying large-scale rearrangements. From the outermost section to the innermost (sections are separated by white rings), the plot displays insertion sequence (IS) insertions in the founding strain, IS-associated deletions, novel IS insertions (each major IS family is drawn in an individual track) and other deletions (not associated with IS elements) recovered from mutation accumulation (MA) data. Colors indicate different IS families (IS5: yellow, IS1: cyan, IS2: red, IS186: green, IS3: blue, IS4: orange, IS150: purple, other deletions: white). A magenta band in the other-deletion section indicates an e14 deletion that occurred 21 times; it is drawn as a single band due to space limitations. Spatial clustering of IS-associated deletions anchored around preexisting IS insertions can be seen.
Figure 2.
Cumulative distribution of IS insertions within a distance threshold d. Insertions recovered in MA data are shown in blue (observed) and average counts from the 1000-permutation test are shown in red (expected). IS1A and IS5A exhibit a bias toward nearby insertions (small values of d, in bp, on the log-scaled x-axis), whereas IS2 shows a bias toward distant insertions.
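The observed-versus-expected comparison in this caption can be sketched as a simple permutation-style simulation. The sketch below makes several assumptions that are not stated in the caption: that distance is measured on the circular chromosome to the nearest preexisting copy of the element, and that the null model places insertions uniformly at random. The function names, the genome length, and all positions are hypothetical toy values.

```python
import random


def circular_distance(a, b, genome_len):
    """Shortest distance between two positions on a circular genome."""
    d = abs(a - b) % genome_len
    return min(d, genome_len - d)


def count_within(insertions, anchors, d, genome_len):
    """Count insertions whose nearest anchor lies within distance d."""
    return sum(
        1
        for pos in insertions
        if min(circular_distance(pos, a, genome_len) for a in anchors) <= d
    )


def expected_within(n_insertions, anchors, d, genome_len, n_perm=1000, seed=0):
    """Average count within d when insertions are placed uniformly at random."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_perm):
        simulated = [rng.randrange(genome_len) for _ in range(n_insertions)]
        total += count_within(simulated, anchors, d, genome_len)
    return total / n_perm


# Toy values (made up for illustration, not data from the study).
GENOME_LEN = 4_600_000
anchors = [1_000_000]                           # preexisting IS copy
insertions = [1_000_500, 1_002_000, 3_000_000]  # new insertions

observed = count_within(insertions, anchors, 5_000, GENOME_LEN)
expected = expected_within(len(insertions), anchors, 5_000, GENOME_LEN)
print(observed, expected)
```

Repeating the comparison over a range of d values would give cumulative observed and expected curves of the kind the caption describes.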
Figure 3.
Size distribution of IS-mediated recombination events. The distribution of all deletions (E + N, E + E and N + N types) mediated by recombination is plotted, except for five recombination events larger than 37 kb (48.3, 81.3, 81.3, 91.9 and 106.1 kb). The peak in the 11–12 kb bin is caused by recurrent deletion of a part of the CP4-6 cryptic prophage. Note that darker colors (dark purple and dark green) indicate overlapping distributions: dark purple indicates an overlap of E + N and N + N, and dark green indicates an overlap of E + N and E + E.
Figure 4.
IS insertion and recombination rates remain constant across MA experiments. Each data point represents the total number of novel IS insertions (A) or recombinations (B) detected in all MA lines of a single MA experiment (y-axis) versus the total number of generations of those MA lines (Table 3). The linear regression line for the insertion rate (A) has a slope of ∼3.5 × 10^-4 (R^2 = 0.93, p = 2.78 × 10^-9), and the linear regression line for the recombination rate (B) has a slope of ∼4.5 × 10^-5 (R^2 = 0.67, p = 1.59 × 10^-4).
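Reading a per-generation rate off the slope of events versus generations can be sketched as an ordinary least-squares fit. This is an assumption-laden illustration: the caption does not say whether the fit includes an intercept (the sketch does), the helper `ols_slope_intercept` is hypothetical, and the three data points are invented toy numbers chosen so that the slope equals 3.5 × 10^-4. They are not values from Table 3.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (made up): total generations and total insertions per MA experiment.
generations = [100_000, 200_000, 300_000]
insertions = [35, 70, 105]

slope, intercept = ols_slope_intercept(generations, insertions)
print(f"estimated rate: {slope:.2e} insertions per genome per generation")
```

With events on the y-axis and generations on the x-axis, the slope has units of events per genome per generation, which is why it can be read as a rate.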
Figure 5.
Sequence logos of the reconstructed target site duplications (TSDs) of IS186. Sequence logos with the 6-bp (A) and 7-bp (B) cores are shown. The core sequences start at position 14 in both logos.
