Genome Biol Evol. 2018 Mar 1;10(3):918-927.
doi: 10.1093/gbe/evy044.

The Prevalence and Evolutionary Conservation of Inverted Repeats in Proteobacteria

Bar Lavi et al. Genome Biol Evol.

Abstract

Perfect short inverted repeats (IRs) are known to be enriched in a variety of bacterial and eukaryotic genomes. Currently, it is unclear whether perfect IRs are conserved over evolutionary time scales. In this study, we aimed to characterize the prevalence and evolutionary conservation of IRs across 20 proteobacterial strains. We first identified IRs in Escherichia coli K-12 substr. MG1655 and showed that they are overabundant. We next aimed to test whether this overabundance is reflected in the conservation of IRs over evolutionary time scales. To this end, for each perfect IR identified in E. coli MG1655, we collected orthologous sequences from related proteobacterial genomes. We next quantified the evolutionary conservation of these IRs, that is, the presence of the exact same IR across orthologous regions. We observed high conservation of perfect IRs: out of the 234 examined orthologous regions, 145 were more conserved than expected, which is statistically significant even after correcting for multiple testing. Our results, together with previous experimental findings, support a model in which imperfect IRs are corrected to perfect IRs in a preferential manner via a template switching mechanism.
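
To make the notion of a perfect IR concrete, the following is a minimal Python sketch of a detector: it reports positions where an arm of fixed length is followed, after a short spacer, by its exact reverse complement. The function name `find_perfect_irs` and the parameters `arm` and `max_spacer` are illustrative assumptions; the abstract does not state the arm lengths, spacer limits, or software used in the study.

```python
# Illustrative sketch only: thresholds and names are assumptions,
# not the detection procedure used in the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_perfect_irs(seq: str, arm: int = 4, max_spacer: int = 5):
    """Return (start, spacer_length) pairs for perfect inverted repeats.

    A hit means seq[start:start + arm] is followed, after `spacer_length`
    bases, by its exact reverse complement.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n - 2 * arm + 1):
        target = reverse_complement(seq[start:start + arm])
        for spacer in range(max_spacer + 1):
            second = start + arm + spacer
            if second + arm > n:
                break  # the second arm would run past the sequence end
            if seq[second:second + arm] == target:
                hits.append((start, spacer))
    return hits


if __name__ == "__main__":
    # GACC ... GGTC are reverse complements separated by a 4-base spacer.
    print(find_perfect_irs("TTGACCAAAAGGTCTT", arm=4, max_spacer=4))
```

An imperfect IR would fail the exact-match test in this sketch; allowing a bounded number of mismatches between the arms would be one way to extend it.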


Figures

Fig. 1. Template switching converts an imperfect IR to a perfect one. (A) The upper and lower sequences represent a perfect and an imperfect IR, respectively, located in an orthologous locus in two genomes. (B) The first switch under intramolecular template switching. Here the nascent strand is used as template. (C) The first switch under intermolecular template switching. Here the strand across the fork is used as template. (D) The second switch returns the nascent strand to the original template, resulting in a perfect IR as represented by the upper sequence in A. Upper case letters represent the IR arms while red dots represent mismatches between the arms. The noncanonical template is marked with a red line. The direction of the replication fork is indicated with an arrow.
Fig. 2. Characteristics of detected IRs in the MG1655 genome. (A) The proportion of regions in the NC collection for which a certain number of IRs was detected. (B) The number of detected IRs in a region as a function of the length of the region. (C) The proportion of regions in a collection for which a certain ratio of IR base-pairs to total number of region base-pairs was detected. (D) The proportion of IRs for which a certain arm length was detected.
Fig. 3. Detected IRs in the entire MG1655 genome compared to simulations. The total number of IRs in the MG1655 NC regions (bold dashed line) compared to the total number of IRs in each of its corresponding null collections.
Fig. 4. Conservation analysis. (A) An example alignment of an IR and its mapping onto its corresponding phylogenetic tree, with the IR of MG1655 as the root sequence. The IR conservation score is 7/11, since 7 out of 11 sequences are identical to the root IR. (B) Conservation score computation for the entire NC region, located between the ldtB and the yblT genes in the MG1655 genome, which contains three additional IRs. (C) Analysis of conservation of IRs in the region located between the ldtB and the yblT genes. The distribution of 1,000 conservation scores computed using simulated data is shown in blue and the conservation score value computed from real data is shown as a bold dashed line. The detailed template switching mechanism that can explain this example is shown in Fig. 1.
Fig. 5. Comparison of conservation significance between IR regions and control regions. An empirical P-value was computed for each region based on its 1,000 corresponding simulated data sets. Shown in blue are the P-values for the IR regions and in green for the control regions.
Fig. 6. Two main processes (template switching in red and the formation of substitutions and indels in black) dictate the dynamics of short IRs. In this scheme there are three sequence states, and the transitions between them occur due to the two processes. Black arrows indicate substitutions and indels, red arrows indicate template switching. The width of the arrows indicates the rate of each process.
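
The conservation score of Fig. 4 and the empirical P-value of Fig. 5 can be illustrated with a short Python sketch. The score is taken here as the fraction of aligned sequences identical to the root IR, matching the 7/11 example in the caption. The function names, the toy sequences, and the add-one correction in the P-value are assumptions for illustration; the captions do not specify the exact formula the authors used.

```python
# Illustrative sketch only: names, toy data, and the add-one correction
# are assumptions, not the authors' implementation.

def conservation_score(root_ir: str, sequences: list[str]) -> float:
    """Fraction of aligned sequences that are identical to the root IR."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    root = root_ir.upper()
    identical = sum(1 for s in sequences if s.upper() == root)
    return identical / len(sequences)


def empirical_p_value(observed: float, simulated: list[float]) -> float:
    """Share of simulated scores at least as high as the observed score.

    The add-one correction keeps the estimate above zero; it is a common
    convention and an assumption here.
    """
    at_least = sum(1 for s in simulated if s >= observed)
    return (at_least + 1) / (len(simulated) + 1)


if __name__ == "__main__":
    # Toy alignment: 7 of 11 sequences match the root IR exactly.
    root = "GACCAAAAGGTC"
    sequences = [root] * 7 + ["GACCAAAAGATC"] * 4
    score = conservation_score(root, sequences)
    print(round(score, 3))

    # Toy null distribution standing in for 1,000 simulated data sets.
    simulated = [0.2] * 990 + [0.7] * 10
    print(empirical_p_value(score, simulated))
```

With one P-value per region, a multiple-testing correction would then be applied across regions, as the abstract notes.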
