Sci Rep. 2016 Dec 7:6:38648.
doi: 10.1038/srep38648.

Assessment of virulence potential of uncharacterized Enterococcus faecalis strains using pan genomic approach - Identification of pathogen-specific and habitat-specific genes

Utpal Bakshi et al. Sci Rep. 2016.

Abstract

Enterococcus faecalis, a leading nosocomial pathogen and yet a prominent member of the gut microbiome, lacks a clear demarcation between pathogenic and non-pathogenic strains at the genome level. Here we present a comparative genome analysis of 36 E. faecalis strains with different pathogenic features and from different body habitats. The study begins by addressing genome dynamics, which show that the pan-genome of E. faecalis is still open, though the core genome is nearly saturated. We identified eight uncharacterized strains as potential pathogens on the basis of their co-segregation with reported pathogens in the gene presence-absence matrix and in Pathogenicity Island (PAI) distribution. A ~7.4 kb genomic cassette, which is itself part of the PAI, is found in all reported and potential pathogens, but not in commensals or the other uncharacterized strains. This region encodes four genes; the products of two hypothetical genes among them are predicted to be intrinsically disordered and may serve as novel therapeutic targets. The exclusive presence of 215, 129, 4 and 1 genes in the blood, gastrointestinal tract, urogenital tract, oral cavity and lymph node derived E. faecalis genomes, respectively, suggests that E. faecalis may employ distinct habitat-specific genetic strategies to adapt to the human host.
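The set logic behind a core genome, a pan-genome and habitat-exclusive genes can be sketched on a small presence-absence table. This is a minimal illustration on invented toy data, not the authors' pipeline: the strain names, gene names and the `exclusive_genes` helper are hypothetical, and "exclusive" is assumed here to mean present in every strain of one habitat and absent from all strains of the others (a looser definition is equally possible).

```python
# Toy presence/absence data: strain -> (habitat, set of gene families).
# All names are invented for illustration.
strains = {
    "strainA": ("blood", {"g1", "g2", "g3", "g7"}),
    "strainB": ("blood", {"g1", "g2", "g3"}),
    "strainC": ("gi_tract", {"g1", "g2", "g4"}),
    "strainD": ("gi_tract", {"g1", "g2", "g4", "g5"}),
}

gene_sets = [genes for _, genes in strains.values()]
core_genome = set.intersection(*gene_sets)  # present in every strain
pan_genome = set.union(*gene_sets)          # present in at least one strain


def exclusive_genes(strains, habitat):
    """Genes in every strain of `habitat` and in no strain from elsewhere."""
    inside = [g for h, g in strains.values() if h == habitat]
    outside = [g for h, g in strains.values() if h != habitat]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    return shared - set().union(*outside)


print(sorted(core_genome))                        # ['g1', 'g2']
print(sorted(exclusive_genes(strains, "blood")))  # ['g3']
```

In this toy table, g7 is not counted as blood-exclusive because it is missing from one blood strain; relaxing the rule to "any strain of the habitat" would replace the intersection with a union.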

Figures

Figure 1. The Core and Pan-Genomes of E. faecalis.
Number of shared or core genes (green) and total number of genes or pan-genome (blue) for 36 different strains of E. faecalis. The lower and upper edges of the boxes indicate the first quartile (25th percentile of the data) and third quartile (75th percentile), respectively, over 1000 random input orders of the genomes. The central horizontal line indicates the sample median (50th percentile). The vertical lines extend from each box as far as the data extend, to a distance of at most 1.5 interquartile ranges (the interquartile range being the distance between the first and third quartile values). At 36 sequenced genomes, the core genome had 2071 genes, whereas the pan-genome had 7131 genes in total. In the pan-genome equation, npan is the expected number of genes for a given number of genomes, N is the number of genomes, and k and γ are free parameters fitted to the curve. The best fit was obtained with k = 2,769.13 and γ = 0.26. Using Heaps' law (α = 1 − γ), the α value is thus 0.74; because α is below 1, this indicates an open pan-genome. In the core genome equation, ncore is the average of the core gene distributions, N is the number of genomes, and k1, τ1, k2, τ2 and Θ are free parameters; the inverse squares of the interquartile ranges of the core genome distributions were used as weights. The best fit gave an asymptotic core genome size Θ of 2065 ± 10 (at the 95% confidence level), which agrees well with the calculated core genome of 2071 genes.
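The arithmetic in this caption can be checked with a short sketch. The caption names the parameters but does not reproduce the pan-genome equation, so the power-law form n_pan = k · N^γ used below is an assumption (it is the usual Heaps' law form for pan-genome growth); the function names are hypothetical, and only k, γ and the relation α = 1 − γ come from the caption.

```python
K = 2769.13   # fitted k reported in the caption
GAMMA = 0.26  # fitted gamma reported in the caption


def pan_genome_size(n_genomes, k=K, gamma=GAMMA):
    """Expected pan-genome size under the ASSUMED form n_pan = k * N**gamma."""
    if n_genomes < 1:
        raise ValueError("n_genomes must be at least 1")
    return k * n_genomes ** gamma


def heaps_alpha(gamma=GAMMA):
    """Heaps' law exponent, alpha = 1 - gamma, as stated in the caption."""
    return 1.0 - gamma


def is_open(alpha):
    """By convention, alpha <= 1 is read as an open pan-genome."""
    return alpha <= 1.0


print(round(heaps_alpha(), 2))   # 0.74
print(is_open(heaps_alpha()))    # True
print(round(pan_genome_size(36)))
```

Under the assumed form, the fitted curve evaluates to roughly 7,000 genes at N = 36, in the neighbourhood of the 7131 genes counted directly.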
Figure 2. Phylogenetic analysis of E. faecalis strains.
(A) Concatenated core genome phylogeny. Pathogenic (PA) and commensal (CO) strains are highlighted in red and green, respectively. (B) Pan-genomic tree built from the binary gene presence/absence matrix of orthologous gene families. Strains are colored by niche (red: blood, purple: gastrointestinal tract, green: urogenital tract, sky blue: oral cavity, orange: lymph node). Bootstrap support values are indicated on the tree. Sequence types (STs) and clonal complexes (CCs) determined by in-silico MLST analysis are also shown.
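A tree built from binary presence/absence data needs some measure of how different two gene-content profiles are. The caption does not say which distance or tree-building method was used, so the sketch below uses the Jaccard distance purely as one plausible choice; the strain and gene names are invented.

```python
def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| for two gene-family sets; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles):
    """Pairwise Jaccard distances as a nested dict keyed by strain name."""
    names = sorted(profiles)
    return {
        x: {y: jaccard_distance(profiles[x], profiles[y]) for y in names}
        for x in names
    }


# Hypothetical gene-content profiles for three strains.
profiles = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g3", "g4"},
    "strainC": {"g1", "g5"},
}
dm = distance_matrix(profiles)
print(dm["strainA"]["strainB"])  # 0.25
```

A matrix like `dm` could then be passed to any standard clustering or neighbor-joining routine; strains with similar gene content end up close together regardless of their core-genome phylogeny.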
Figure 3. Distribution of pathogenicity island (PAI) genes in E. faecalis strains.
The PAI of E. faecalis comprises 129 genes, which are grouped into six island modules (modules A–F, indicated by red boxes). Presence and absence of PAI genes are shown in green and white, respectively. Strains are hierarchically clustered according to their PAI module completion. Different nodes in this cluster are labeled with lowercase letters (a–h). PA and CO strains are highlighted in red and green, respectively. Putative PA strains are highlighted in orange.
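"Module completion" is not defined in the caption. One simple reading, assumed here, is the fraction of a module's genes that a strain carries. The sketch below computes that fraction per module; the module contents, gene identifiers and the `module_completion` helper are all hypothetical.

```python
def module_completion(strain_genes, modules):
    """Fraction of each module's genes present in one strain (assumed metric)."""
    return {
        name: len(genes & strain_genes) / len(genes)
        for name, genes in modules.items()
        if genes  # skip empty modules to avoid division by zero
    }


# Invented example: two small modules and one strain's PAI gene content.
modules = {
    "A": {"p1", "p2", "p3", "p4"},
    "B": {"p5", "p6"},
}
strain = {"p1", "p2", "p5", "p6", "x9"}

completion = module_completion(strain, modules)
print(completion)  # {'A': 0.5, 'B': 1.0}
```

Per-strain vectors of such fractions are the kind of input a hierarchical clustering step could group, so that strains with similarly complete modules fall under the same node.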
Figure 4. Identification of Intrinsically Disordered Regions (IDRs) in pathogen-specific protein of E. faecalis.
IDRs are identified by sequence comparison with various structural features using the DICHOT tool of the IDEAL (Intrinsically Disordered proteins with Extensive Annotations and Literature) database. The selected features are: i) PDB BLAST for 3D structure; ii) PFAM domain search for sequence motifs; iii) SCOP domain search for classified structure; and iv) SEG analysis for identification of low-complexity regions. The final ordered and disordered regions are shown as a line diagram along the length of the protein.
Figure 5. Niche-specific genes in blood and GI tract E. faecalis strains.
(A) The 215 and 129 gene families specific to the blood and GI tract niches are shown in red and purple, respectively. (B) Comparative functional analysis of Clusters of Orthologous Groups (COG) frequencies between the blood and GI tract niches. COG functional categories are represented by one-letter codes: J: Translation, ribosomal structure and biogenesis, K: Transcription, L: Replication, recombination and repair, D: Cell cycle control, cell division, chromosome partitioning, V: Defense mechanisms, T: Signal transduction mechanisms, M: Cell wall/membrane/envelope biogenesis, N: Cell motility, U: Intracellular trafficking, secretion, and vesicular transport, O: Posttranslational modification, protein turnover, chaperones, C: Energy production and conversion, G: Carbohydrate transport and metabolism, E: Amino acid transport and metabolism, F: Nucleotide transport and metabolism, H: Coenzyme transport and metabolism, I: Lipid transport and metabolism, P: Inorganic ion transport and metabolism, Q: Secondary metabolites biosynthesis, transport and catabolism, R: General function prediction only, S: Function unknown. Frequencies of the four major COG groups are also shown for these two niches.
