PLoS Genet. 2024 Oct 24;20(10):e1011459.
doi: 10.1371/journal.pgen.1011459. eCollection 2024 Oct.

Evolution of the pheV-tRNA integrated genomic island in Escherichia coli

Nguyen Thi Khanh Nhu et al. PLoS Genet.

Abstract

Escherichia coli exhibits extensive genetic diversity at the genome level, particularly within its accessory genome. The tRNA integrated genomic islands (GIs), a part of the E. coli accessory genome, play an important role in pathogenicity. However, studies examining the evolution of GIs have been challenging due to their large size, considerable gene content variation and fragmented assembly in draft genomes. Here we examined the evolution of the GI integrated at pheV-tRNA (GI-pheV), with a primary focus on uropathogenic E. coli (UPEC) and the globally disseminated multidrug-resistant ST131 clone. We show the gene content of GI-pheV is highly diverse and arranged in a modular configuration, with the P4 integrase-encoding gene intP4 the only conserved gene. Despite this diversity, the GI-pheV gene content displayed conserved features among strains from the same pathotype. In ST131, GI-pheV corresponding to the reference strain EC958 (EC958_GI-pheV) was found in ~90% of strains. Phylogenetic analyses suggested that GI-pheV in ST131 has evolved together with the core genome, with the loss/gain of specific modules (or the entire GI) linked to strain-specific events. Overall, we show GI-pheV exhibits a dynamic evolutionary pathway, in which modules and genes have evolved through multiple events including insertions, deletions and recombination.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Prevalence of GI-pheV in the 2,382 E. coli complete genomes from the RefSeq Database.
The total number of E. coli genomes in each phylogroup is shown in brackets.
Fig 2. Phylogeny of intP4 in the 40 GI-pheV from the reference set of 66 E. coli complete genomes.
A phylogenetic tree of 40 intP4 nucleotide sequences (except SMS-3-5_intP4) was constructed using RAxML, with 1,000 bootstraps and the scale showing the number of SNPs. The intP4 gene clustered into 5 main alleles, namely intP4.1 to intP4.5 (coloured across each group for clarity). Highly supported branches are marked with one (*) or two (**) asterisks based on their bootstrap values (>50% and >90%, respectively). E. coli phylogroups, pathotypes and the 3’ end direct repeat (attR) of GI-pheV are shown on the right of the intP4 phylogeny (indicated by black boxes). Strain types: non-path, non-pathogenic; EPEC, Enteropathogenic E. coli; EHEC, Enterohemorrhagic E. coli; ETEC, Enterotoxigenic E. coli; EAEC, Enteroaggregative E. coli; UPEC, Uropathogenic E. coli; NMEC, neonatal meningitis associated E. coli; APEC, Avian pathogenic E. coli.
Fig 3. Gene content of intP4.1_GI-pheV.
Gene content of intP4.1_GI-pheV visualized based on an alignment of the entire GI using Easyfig (drawn to scale). The black to grey gradient between the GI indicates their pairwise nucleotide conservation. Conserved genes and gene modules are colour coded as indicated in the text. Insertion sequences (transposons, transposases and other mobile elements) are drawn as black arrows.
Fig 4. Phylogenetic relationship of intP4.1_GI-pheV and its modules.
Whole GI-pheV or gene modules were aligned using Mauve or ClustalO, respectively. Phylogenetic trees were constructed using RAxML, with 1,000 bootstraps. Highly supported branches are shown with bootstrapped values and scales indicating the number of SNPs. The phylogenetic trees are as follows: (A) nine intP4.1_GI-pheV (phylogenetic tree constructed based on a 30,859-bp core GI-pheV). (B) eight intP4.1_GI-pheV (phylogenetic tree excluding CE10_GI-pheV, constructed based on a 64,922-bp core GI-pheV). (C) shiF_iucABCD_iutA_sat module. (D) nan module. (E) flu module. (F) yee module.
Fig 5. Nucleotide conservation and selection pressures within intP4.1_GI-pheV.
The average dN/dS ratio is indicated with a triangle (value range on the left y-axis). Nucleotide similarity, compared to the consensus sequences, is shown with box plots (value range on the right y-axis), with the middle line showing the group median and the coloured dots showing the group outliers (defined as any value more than 1.5 times the interquartile range below the first quartile).
Fig 6. The prevalence of ST131_GI-pheV in ST131.
The gene content of EC958_GI-pheV is shown along the x-axis, with the annotation on top (drawn to scale). Strain identifiers are on the y-axis together with the ST131 phylogenetic tree (SNP based), coloured according to clade designations: red, clade A; orange, clade B; pink, B-C intermediate clade; green, clade C. The presence of ST131_GI-pheV in ST131 was identified by BLASTn analysis of ST131 genome assemblies using EC958_GI-pheV as a reference and a cut-off at 95% nucleotide conservation. The BLASTn data was visualized with SeqFindr. The presence of GI-pheV and the type of intP4 and attR are indicated on the right.
Fig 7. Evolutionary relationships of ST131_GI-pheV and their gene modules.
Whole GI-pheV or gene modules were aligned using Mauve or ClustalO, respectively. Phylogenetic trees were constructed using RAxML, with 1,000 bootstraps. Highly supported branches are shown as asterisks for a bootstrapped value > 75%, with scales indicating the number of substitution SNPs. Branches are coloured according to the ST131 clades to which the majority of strains belong: red, clade A; orange, clade B; pink, B-C intermediate clade; green, clade C. Individual strains that cluster with strains from different clades are indicated by filled circles or strain names coloured with their corresponding clade. (A) EC958_GI-pheV. (B) shiF_iucABCD_iutA_sat module. (C) nan module. (D) flu module. (E) yee module.
Fig 8. Relationship between intP4 in ST131 and in E. coli complete genomes.
A phylogenetic tree of ST131_intP4 together with 40 intP4 from 66 complete E. coli genomes examined in this study was constructed using RAxML, with 1,000 bootstraps. The majority of the ST131_intP4 clustered in the intP4.1 allele, together with other UPEC strains. ST131 strains are indicated by coloured dots in accordance with their clade designation: red, clade A; orange, clade B; pink, B-C intermediate clade; green, clade C. Highly supported branches are marked by an asterisk (bootstrapped value >75%).
Fig 9. Genomic comparison among GI-pheV with different gene content in ST131.
The varied gene content of GI-pheV, visualized based on the alignment of the entire GI using Easyfig. The black to grey gradient between the GIs indicates their pairwise nucleotide conservation. Conserved genes and gene modules are coloured as indicated in the text. Strain names are coloured in accordance with their ST131 clade designation: red, clade A; orange, clade B; green, clade C.
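
The outlier rule stated in the Fig 5 caption (values lying more than 1.5 times the interquartile range below the first quartile) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name low_outliers, the "inclusive" quartile method and the example similarity values are all assumptions made for illustration, and different plotting tools may compute quartiles slightly differently.

```python
from statistics import quantiles


def low_outliers(similarities, k=1.5):
    """Return values lying more than k * IQR below the first quartile.

    Hypothetical helper: the quartile method ("inclusive") is an assumption,
    not something specified in the figure caption.
    """
    q1, _median, q3 = quantiles(similarities, n=4, method="inclusive")
    lower_fence = q1 - k * (q3 - q1)  # threshold below which a value is flagged
    return [s for s in similarities if s < lower_fence]


# Made-up percent-similarity values for one hypothetical gene module.
example = [99.0, 99.5, 100.0, 100.0, 100.0, 80.0]
print(low_outliers(example))
```

Only the lower fence is checked here because the caption defines outliers relative to the first quartile; nucleotide similarity to a consensus is capped at 100%, so unusually divergent sequences appear on the low side.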
