Comparative Study
Nucleic Acids Res. 2006 Jan 9;34(1):e3.
doi: 10.1093/nar/gnj005.

A novel strategy for the identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites in closely related bacteria

Hong-Yu Ou et al. Nucleic Acids Res. 2006.

Abstract

We devised software tools to systematically investigate the contents and contexts of bacterial tRNA and tmRNA genes, which are known insertion hotspots for genomic islands (GIs). The strategy, based on MAUVE-facilitated multigenome comparisons, was used to examine 87 Escherichia coli MG1655 tRNA and tmRNA genes and their orthologues in E.coli EDL933, E.coli CFT073 and Shigella flexneri Sf301. Our approach identified 49 GIs occupying approximately 1.7 Mb that mapped to 18 tRNA genes, missing 2 but identifying a further 30 GIs as compared with Islander [Y. Mantri and K. P. Williams (2004), Nucleic Acids Res., 32, D55-D58]. All these GIs had many strain-specific CDS, anomalous GC contents and/or significant dinucleotide biases, consistent with foreign origins. Our analysis demonstrated marked conservation of sequences flanking both empty tRNA sites and tRNA-associated GIs across all four genomes. Remarkably, there were only 2 upstream and 5 downstream deletions adjacent to the 328 loci investigated. In silico PCR analysis based on conserved flanking regions was also used to interrogate hotspots in another eight completely or partially sequenced E.coli and Shigella genomes. The tools developed are ideal for the analysis of other bacterial species and will lead to in silico and experimental discovery of new genomic islands.

Figures

Figure 1
Flowchart depicting the tRNAcc high-throughput strategy developed and used to analyse the contents and contexts of tRNA genes in sequenced E.coli and Shigella genomes. Four stand-alone tools, indicated in bold italic font in the figure, were employed to identify islands (IdentifyIsland, TabulateIsland) and design primers (ExtractFlank, Primaclade) corresponding to the conserved upstream and downstream flanking regions of each tRNA site to be interrogated. See Table 1 for a summary of the programs' features. In this study, four complete genomes were compared by the tRNAcc method: E.coli K-12 MG1655, E.coli UPEC CFT073, E.coli O157:H7 EDL933 and S.flexneri 2a Sf301. Four distinct genome subsets were analysed with the MG1655 genome being used as the reference template in each case. The numbers in the ovals above the word ‘tRNA’ indicate the number of tRNA genes still being considered at each stage in the analysis. The following abbreviations were used: UCB, upstream chromosomal block; DCB, downstream chromosomal block; GI, genomic island; UF, 2 kb upstream conserved flank; DF, 2 kb downstream conserved flank.
Figure 2
Schematic representation of a range of hypothetical tRNA site configurations present in the four complete genomes (MG1655, CFT073, EDL933 and Sf301) (a–f). The conserved UF and DF regions flanking tRNA genes are shown as dark grey filled boxes. UF and DF boxes drawn below the line indicate inversions with respect to the reference template MG1655 (c and d). The UF and DF boxes shown in pale grey with a broken outline represent deletions with respect to MG1655 (e and f). Genomic islands, where present, are indicated as broken boxes to emphasize the relatively large size of these regions. Arrowheads shown below each sub-figure indicate the location and orientation of primers specific to the UF and DF regions. Hollow arrowheads indicate the absence of matching complementary sequence. The solid line between the arrowheads shown in (a) indicates a likely successful in vitro PCR amplification, while the dotted line in (b) indicates a successful e-PCR-based ‘amplification’ that would typically yield a product far larger than could be generated through standard in vitro PCR. The numbers shown above each configuration after the colon symbol represent the number of examples observed in the four genomes tested based on the 87 MG1655 tRNA genes and the total complement of orthologues present in the other three genomes (Table 2). The numbers of examples observed in the five unpublished genomes (E.coli EAEC O42, EPEC E2348/69, ETEC E24377A, E.coli HS and S.sonnei 53G), with respect to the subset of 20 tRNA genes only (Supplementary Table S3), are shown in parentheses. The symbols shown alongside the drawings are used in Table 2 and Supplementary Table S4 to highlight tRNA loci affected by inversions and/or deletions. Examples of the various atypical configurations observed in the four genomes are shown to the right. The figure is not drawn to scale.
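The primer logic in this figure can be illustrated with a minimal exact-match in silico PCR sketch in Python. It is a toy under stated assumptions, not the e-PCR tool used in the study: the function names reverse_complement and in_silico_pcr and the example sequences are hypothetical, primers must match perfectly, and only the forward strand of the template is searched. A missing primer site, drawn in the figure as a hollow arrowhead, yields no product.

```python
# Complement table for unambiguous DNA bases (assumption: input uses only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the predicted product length, or None if a primer site is absent.

    The forward primer is matched as written; the reverse primer is matched
    through its reverse complement, downstream of the forward primer site.
    """
    template = template.upper()
    fwd_start = template.find(forward_primer.upper())
    if fwd_start == -1:
        return None  # no UF primer site
    rev_site = reverse_complement(reverse_primer)
    rev_start = template.find(rev_site, fwd_start + len(forward_primer))
    if rev_start == -1:
        return None  # no DF primer site
    return rev_start + len(rev_site) - fwd_start


# Hypothetical toy locus: UF primer site, a 10-base insert, then the DF primer site.
toy_locus = "CC" + "GATTACA" + "TTTTTTTTTT" + "CCTGAAG" + "AA"
product_length = in_silico_pcr(toy_locus, "GATTACA", "CTTCAGG")
no_product = in_silico_pcr(toy_locus, "GATTACA", "GGGGGGG")
```

A large predicted product between conserved flanks, as in configuration (b), would point to inserted DNA at the site even when the product is far too long for standard in vitro PCR.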
Figure 3
The distribution of H-values corresponding to the island-borne CDS in E.coli K-12 MG1655, E.coli UPEC CFT073, E.coli O157:H7 EDL933 and S.flexneri 2a Sf301 identified by tRNAcc and/or Islander methods. This homology score was proposed by Fukiya et al. (18) and reflects the degree of similarity between the matching reference genome sequence and the CDS itself in terms of the length of match and the degree of identity at the DNA level. See Supplementary Data for details. Red, green, blue, cyan and magenta bars represent total CDS, CFT073 CDS, EDL933 CDS, MG1655 CDS and Sf301 CDS, respectively. Note that each CDS in a given genome has three H-values that were obtained by BLASTN searches against the other three genomes in turn.
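The exact H-value formula is given only in the Supplementary Data, so the Python sketch below assumes one plausible form that combines the two quantities the caption names: the identity of the best BLASTN match (as a fraction between 0 and 1) multiplied by the match length and divided by the query CDS length. The function name h_value and the example numbers are hypothetical.

```python
def h_value(identity, match_length, query_length):
    """Assumed homology score: identity * match_length / query_length.

    identity      fraction of identical bases in the best match (0 to 1)
    match_length  length of the best-scoring match, in bases
    query_length  length of the query CDS, in bases
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    return identity * match_length / query_length


# Hypothetical CDS of 1000 bases whose best match covers 500 bases at 90% identity.
example_h = h_value(0.9, 500, 1000)
```

Under this assumed form, a CDS with no significant match in the other genome scores near 0 and a fully conserved CDS scores near 1, so strain-specific island genes would cluster at the low end of the distribution.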
Figure 4
Negative cumulative GC profile (23) highlighting the genomic context of islands identified by tRNAcc in E.coli O157:H7 EDL933. A sharp upward spike in the negative cumulative GC profile indicates a relatively sharp increase in GC content, whereas an abrupt fall indicates a relatively sharp decrease in GC content. The locations of tRNA-associated genomic islands are shown in green and the tRNA and tmRNA genes are represented as blue diamonds. Details of this plot are specified in the supplementary material.
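The behaviour described in this caption can be reproduced with a simplified stand-in for the profile, sketched here in Python. This is not necessarily the formula of reference (23), whose details are in the supplementary material: it assumes a running sum of (1 if the base is G or C, else 0) minus the sequence-wide mean GC fraction, so the curve rises through GC-rich stretches and falls through AT-rich ones. The function name and toy sequence are hypothetical.

```python
def detrended_cumulative_gc(seq):
    """Return a cumulative GC curve detrended by the sequence-wide GC fraction.

    Simplified stand-in (assumption): each position contributes
    (1 if G or C else 0) minus the mean GC fraction of the whole sequence.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    is_gc = [1 if base in "GC" else 0 for base in seq.upper()]
    mean_gc = sum(is_gc) / len(is_gc)
    profile = []
    running = 0.0
    for flag in is_gc:
        running += flag - mean_gc
        profile.append(running)
    return profile


# Hypothetical toy sequence: an AT-rich half followed by a GC-rich half.
toy_profile = detrended_cumulative_gc("ATATGCGC")
```

In the toy sequence the curve falls across the AT-rich half and climbs back across the GC-rich half, which is the pattern the caption describes for abrupt changes in GC content.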
Figure 5
The four pheV-borne islands in MG1655 (a), CFT073 (b), Sf301 (c) and EDL933 (d) genomes identified by the tRNAcc method. The 9.1 kb island in MG1655, 127.9 kb island in CFT073 and 55.1 kb island in Sf301 are flanked by conserved upstream (UF) and downstream (DF) backbone segments. However, the DF region, common to the other three genomes, is absent in EDL933. Instead, the first instance of a conserved chromosomal block common to the other genomes occurs 23.5 kb downstream of the EDL933 pheV gene. This secondary conserved block has been designated as DF′. Matching 2 kb flanking regions are represented as connected blocks. In this study, genomic island-like regions were defined as anomalous segments between the 3′ end of tRNA genes and the 5′ end of the conserved downstream flank. Consequently, the tRNAcc-identified GIs in MG1655, CFT073 and Sf301 lay between the pheV and DF loci, while that in EDL933 was defined as the segment between the pheV gene and the proximal boundary of the DF′ conserved segment. The Islander-defined 104.6 and 46.7 kb islands at the pheV locus in CFT073 and Sf301, respectively, are shown as red lines flanked by DR sequences (red rectangles).
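The island definition used in this caption, the segment between the 3′ end of the tRNA gene and the 5′ end of the conserved downstream flank, amounts to simple coordinate arithmetic. The Python sketch below assumes 1-based inclusive coordinates on the forward strand; the function name island_span and the coordinates are hypothetical and are not taken from the genomes discussed.

```python
def island_span(trna_end, df_start):
    """Return (start, end, length) of the segment between a tRNA gene and its DF.

    trna_end  last base of the tRNA gene (1-based, inclusive)
    df_start  first base of the conserved downstream flank (DF, or DF' when
              the usual DF is absent)
    """
    start = trna_end + 1
    end = df_start - 1
    length = max(0, end - start + 1)  # 0 when the DF directly follows the tRNA
    return start, end, length


# Hypothetical coordinates: the DF begins 9100 bases after the tRNA gene ends.
occupied_site = island_span(1000, 10101)
# Hypothetical empty site: the DF begins immediately after the tRNA gene.
empty_site = island_span(1000, 1001)
```

For a locus such as EDL933 pheV, where the usual DF is absent, the start of the secondary block DF′ would be passed as df_start instead.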

References

    1. Perna N.T., Plunkett G., III, Burland V., Mau B., Glasner J.D., Rose D.J., Mayhew G.F., Evans P.S., Gregor J., Kirkpatrick H.A., et al. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature. 2001;409:529–533.
    2. Hacker J., Kaper J.B. Pathogenicity islands and the evolution of microbes. Annu. Rev. Microbiol. 2000;54:641–679.
    3. Hou Y.M. Transfer RNAs and pathogenicity islands. Trends Biochem. Sci. 1999;24:295–298.
    4. Williams K.P. Integration sites for genetic elements in prokaryotic tRNA and tmRNA genes: sublocation preference of integrase subfamilies. Nucleic Acids Res. 2002;30:866–875.
    5. Simillion C., Vandepoele K., Van de Peer Y. Recent developments in computational approaches for uncovering genomic homology. Bioessays. 2004;26:1225–1235.
