Sci Rep. 2016 Jul 20:6:30115.
doi: 10.1038/srep30115.

Genomic content typifying a prevalent clade of bovine mastitis-associated Escherichia coli

Robert J Goldstone et al. Sci Rep.

Erratum in

Abstract

E. coli represents a heterogeneous population with capabilities to cause disease in several anatomical sites. Among the sites that can be colonised is the bovine mammary gland (udder), and a distinct class of mammary pathogenic E. coli (MPEC) has been proposed. MPEC are the principal causative agents of bovine mastitis in well-managed dairy farms, costing producers in the European Union an estimated €2 billion per year. Despite the economic impact, and the threat this disease presents to small and medium-sized dairy farmers, the factors which mediate the ability of E. coli to thrive in bovine mammary tissue remain poorly elucidated. Strains belonging to E. coli phylogroup A are most frequently isolated from mastitis. In this paper, we apply a population-level genomic analysis to this group of E. coli to uncover genomic signatures of mammary infectivity. Through a robust statistical analysis, we show that not all strains of E. coli are equally likely to cause mastitis, and that those that do possess specific gene content that may promote their adaptation and survival in the bovine udder. Through a pan-genomic analysis, we identify just three genetic loci which are ubiquitous in MPEC, but appear dispensable for E. coli from other niches.

Figures

Figure 1. Position of 66 mastitis-associated E. coli isolates within phylogroup A.
A maximum likelihood tree constructed from the concatenated sequence of 159 core E. coli genes elaborates the known population structure of E. coli. Using this tree, we positioned the 66 MPEC isolates within phylogroup A (grey bars). Branches are coloured according to phylogroup: (A) blue; (B1) green; (B2) red; (C) magenta; (D) brown; (E) cyan; (F) purple; Shigella, gold. Shigella genomes which fall into other phylogroups are not coloured.
Figure 2. Mastitis isolates are more closely related to each other, on average, than would be expected by chance.
Panel A shows a phylogenetic tree for 533 phylogroup A genomes, constructed from the concatenated sequence of 520 non-recombining genes as estimated by maximum likelihood. For clarity, bootstrap values have been removed. Labels are coloured according to country of origin (Belgium = brown, Finland = green, France = blue, Germany = purple, Israel = gold, UK = red). One isolate (ECC-Z) was isolated from the Netherlands, and one was isolated in Denmark. Panel B shows the results of a resampling analysis to investigate the probability that the average phylogenetic distance between MPEC could be generated by randomly placing MPEC genomes onto the phylogroup A phylogenetic tree. The bell curve in the plot represents the kernel density estimate of 100,000 replications, in each of which the average distance between 66 randomly selected genomes is calculated. The red vertical line represents the actual average distance observed between MPEC. The p-value is the proportion of randomised samples that display a distance as low as, or lower than, that observed between MPEC. The average distance between MPEC genomes is significantly lower than expected by chance (p = 0.00015): only 15 in 100,000 randomised replications had average distances as low as or lower than that observed between MPEC genomes. The four vertical grey bars represent the locations on the distribution that would yield p-values of 0.0001, 0.001, 0.01, and 0.05, respectively.
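The resampling analysis in Panel B can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the nested-dictionary distance matrix, and the toy genomes below are all hypothetical, and the real analysis would use distances taken from the phylogroup A tree.

```python
import random
from itertools import combinations


def mean_pairwise_distance(dist, members):
    """Average distance over all unordered pairs of the given genomes."""
    pairs = list(combinations(members, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def resampling_p_value(dist, all_genomes, group, replications=2000, seed=0):
    """Proportion of random samples (same size as `group`) whose mean
    pairwise distance is as low as, or lower than, the observed one."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(dist, group)
    hits = 0
    for _ in range(replications):
        sample = rng.sample(all_genomes, len(group))
        if mean_pairwise_distance(dist, sample) <= observed:
            hits += 1
    return observed, hits / replications


# Toy data (hypothetical): 20 genomes placed on a line, distance = separation.
genomes = [f"g{i}" for i in range(20)]
dist = {a: {b: abs(i - j) for j, b in enumerate(genomes)}
        for i, a in enumerate(genomes)}
clustered = ["g0", "g1", "g2", "g3"]  # a tight group, standing in for MPEC

observed, p = resampling_p_value(dist, genomes, clustered)
print(f"observed mean distance = {observed:.3f}, p = {p:.4f}")
```

With 100,000 replications, as in the caption, the smallest non-zero p-value that can be resolved is 1 in 100,000; the sketch defaults to fewer replications only to keep it fast.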
Figure 3. Within-country MPEC isolates are no more similar than would be expected by chance.
Panel A shows a maximum likelihood tree of the 66 MPEC genomes used in this study, showing their relative positions within phylogroup A. Labels are coloured according to country of origin (Belgium = brown, Finland = purple, France = blue, Germany = gold, Israel = green, UK = red). One isolate (ECC-Z) was isolated from the Netherlands, and one from Denmark. Both of these isolates are coloured black and, because they are the only representatives of their country groups, were excluded from the analysis in this figure. The countries of origin appear well mixed throughout the phylogenetic tree. Informative bootstrap values are given as integers adjacent to bifurcations. Panel B shows density estimates for the average phylogenetic distance observed between 10,000 randomised samples of the same number of genomes as isolates from each country (n, given alongside the country name for each plot), alongside a red vertical line which denotes the actual average distance between the E. coli genomes from each country. For each country, the average distance observed between the strains is no different from that which could be generated by a random process. The four grey vertical lines to the right of the leading edge of the density plots represent p-values of 0.0001, 0.001, 0.01, and 0.05, respectively.
Figure 4. The core genome and pan genome size of the strains investigated.
Panel A shows a curve for the core genome (genes present in at least n-1 strains) of phylogroup A E. coli (blue) and MPEC (red) when n strains are sampled from the populations, over 10,000 replications per data point. Polygons represent the standard deviation at each data point. These data show that MPEC have a larger core genome than is typical of phylogroup A. Panel B shows a curve for the pan-genome (genes present in at least one strain) for phylogroup A (blue) or MPEC (red) strains when n genomes are sampled from the population, over 10,000 replications per data point. Polygons represent the standard deviation at each data point. These data show that MPEC have a smaller pan-genome than phylogroup A E. coli.
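The sampling behind the core and pan-genome curves can be sketched as follows. This is an illustrative Python example rather than the authors' implementation: the function name, the dictionary of gene sets, and the toy genomes are hypothetical, and the use of the population standard deviation is an assumption, since the caption does not say which estimator was used.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def core_and_pan_sizes(gene_sets, n, replications=1000, seed=0):
    """Sample n genomes repeatedly and summarise core and pan-genome sizes.

    Core: genes present in at least n-1 of the sampled genomes.
    Pan:  genes present in at least one of the sampled genomes.
    Returns ((core_mean, core_sd), (pan_mean, pan_sd)).
    """
    rng = random.Random(seed)
    names = list(gene_sets)
    core_sizes, pan_sizes = [], []
    for _ in range(replications):
        sample = rng.sample(names, n)
        # Count in how many sampled genomes each gene occurs.
        counts = Counter(g for name in sample for g in gene_sets[name])
        core_sizes.append(sum(1 for c in counts.values() if c >= n - 1))
        pan_sizes.append(len(counts))
    return ((mean(core_sizes), pstdev(core_sizes)),
            (mean(pan_sizes), pstdev(pan_sizes)))


# Toy data (hypothetical): five genomes sharing two genes, plus one unique gene each.
toy = {f"s{i}": {"geneA", "geneB", f"unique{i}"} for i in range(5)}

for n in (3, 4, 5):
    (core_mean, core_sd), (pan_mean, pan_sd) = core_and_pan_sizes(toy, n)
    print(n, core_mean, core_sd, pan_mean, pan_sd)
```

Plotting the means against n, with the standard deviations as a shaded band, would give curves of the kind described in the caption. Note that the n-1 threshold makes the core equal to the pan-genome for n = 1 and n = 2, so the curves are only informative for larger samples.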
Figure 5. The carriage of the genes surrounding ymdE and ycdU.
These data show that several genes between pgaD and ycdU are more abundant in MPEC genomes from phylogroup A than they are in the wider phylogroup A population. These genes comprise what appears to be a genomic island, flanked by the core genes efeB/phoH and ghrA/ycdXYZ. Although pgaB and pgaA, along with ymdE and ycdU, are present in fewer than 446 of all phylogroup A genomes (blue open circles), only ymdE and ycdU are also present in at least 65/66 MPEC genomes (red filled circles), qualifying these as MPEC-specific core.
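The selection rule described in this caption (present in at least n-1 genomes of the group of interest, but below a count threshold in the wider population) can be written as a short Python sketch. The function names and toy gene sets are hypothetical, and whether the wider population should include the group itself is left to the caller; this is an illustration of the rule, not the authors' pipeline.

```python
def carriage(gene_sets, gene):
    """Fraction of genomes in `gene_sets` that carry `gene`."""
    return sum(gene in genes for genes in gene_sets.values()) / len(gene_sets)


def group_specific_core(group_sets, background_sets, max_background_count):
    """Genes present in at least n-1 of the n group genomes and in fewer
    than `max_background_count` genomes of the background population."""
    n = len(group_sets)
    candidates = set().union(*group_sets.values())
    selected = set()
    for gene in candidates:
        in_group = sum(gene in s for s in group_sets.values())
        in_background = sum(gene in s for s in background_sets.values())
        if in_group >= n - 1 and in_background < max_background_count:
            selected.add(gene)
    return selected


# Toy data (hypothetical gene names and genomes).
group = {
    "m1": {"coreX", "locus1", "locus2"},
    "m2": {"coreX", "locus1"},
    "m3": {"coreX", "locus1", "locus2"},
}
background = {
    "b1": {"coreX"},
    "b2": {"coreX", "locus2"},
    "b3": {"coreX", "locus1"},
    "b4": {"coreX", "locus2"},
}

specific = group_specific_core(group, background, max_background_count=3)
print(sorted(specific), carriage(background, "locus1"))
```

In the toy data, `coreX` is excluded because it is present throughout the background, while `locus1` and `locus2` pass both thresholds.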
Figure 6. Carriage of the paa region in MPEC compared with phylogroup A E. coli.
The core MPEC genes are coloured green, whilst a region of the paa locus which has been deleted in several MPEC is coloured yellow. The ybdA gene, which in MG1655 is a pseudogene, is outlined in magenta; the carriage shown for this gene is for the full-length composite sequence from the MG1655 genome rather than for each half separately.
Figure 7. The carriage of the fec locus in MPEC and phylogroup A E. coli.
This plot shows the genomic context of the fecIRABCDE genes in the genome of MG1655 and the percent carriage of each gene in the phylogroup A population (blue open circles) versus the MPEC population (red closed circles). The seven genes which form part of the specific MPEC core genome are coloured green. These genes (fecIRABCDE) enable the bacteria to utilise ferric citrate as a source of iron for growth. They are found in only 68% of all phylogroup A genomes, but in all 66 of the MPEC genomes we investigated. The genes flanking the fec locus show differing levels of carriage, which tend to be lower than that observed for the fec locus itself. This suggests that the genomic context of fec differs between strains.