Genome Biol Evol. 2013;5(11):2109-23.
doi: 10.1093/gbe/evt159.

Phylogenetic diversity of the enteric pathogen Salmonella enterica subsp. enterica inferred from genome-wide reference-free SNP characters

Ruth E Timme et al. Genome Biol Evol. 2013.

Abstract

The enteric pathogen Salmonella enterica is one of the leading causes of foodborne illness in the world. The species is extremely diverse, containing more than 2,500 named serovars that are designated by their unique antigen characters and pathogenicity profiles; some are known to be virulent pathogens, whereas others are not. Answers to questions regarding the evolution of pathogenicity, the significance of antigen characters, and the diversity of clustered regularly interspaced short palindromic repeat (CRISPR) loci, among others, will remain elusive until a strong evolutionary framework is established. We present the first large-scale S. enterica subsp. enterica phylogeny inferred from a new reference-free k-mer approach to gathering single nucleotide polymorphisms (SNPs) from whole genomes. The phylogeny of 156 isolates representing 78 serovars (102 of the isolates were newly sequenced) reveals two major lineages, each with many strongly supported sublineages. One of these lineages is the S. Typhi group, which is well nested within the phylogeny. Lineage-through-time analyses suggest that there have been two instances of accelerated diversification within the subspecies. We also found that antigen characters and CRISPR loci show evolutionary patterns that differ from that of the phylogeny, suggesting that horizontal gene transfer or possibly a shared environmental acquisition might have influenced the present character distribution. Our study also demonstrates that reference-free SNPs can be extracted from a large set of genomes and then used for phylogenetic reconstruction. This automated, annotation-free approach is an important step forward for bacterial disease tracking and for efficiently elucidating the evolutionary history of highly clonal organisms.
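To make the idea of reference-free k-mer SNP discovery concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in the study: the function names, the toy sequences, and the choice of k = 5 are all hypothetical, and the sketch ignores reverse-complement strands and sequencing error. The assumed principle is that an odd-length k-mer whose flanking bases match across genomes but whose central base differs marks a candidate SNP, so no reference genome or alignment is needed.

```python
from collections import defaultdict


def central_variants(seq, k):
    """Map each (left flank, right flank) context to the central bases seen in seq."""
    half = k // 2
    contexts = defaultdict(set)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        contexts[(kmer[:half], kmer[half + 1:])].add(kmer[half])
    return contexts


def find_snps(genomes, k=5):
    """Return {context: {genome name: central base}} for variable contexts.

    genomes is a dict of name -> sequence string (hypothetical input format).
    A context is reported only if at least two genomes carry it with
    different central bases and no genome is ambiguous for it.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that the k-mer has a central base")
    per_genome = {name: central_variants(seq, k) for name, seq in genomes.items()}
    all_contexts = set()
    for contexts in per_genome.values():
        all_contexts.update(contexts)

    snps = {}
    for ctx in all_contexts:
        alleles = {}
        ambiguous = False
        for name, contexts in per_genome.items():
            bases = contexts.get(ctx)
            if bases is None:
                continue  # this genome lacks the flanking context
            if len(bases) > 1:
                ambiguous = True  # repeated context within one genome; drop the locus
                break
            alleles[name] = next(iter(bases))
        if not ambiguous and len(set(alleles.values())) > 1:
            snps[ctx] = alleles
    return snps


# Toy example: genome B differs from A and C at a single position.
toy_genomes = {
    "A": "ACGTACCGTTAG",
    "B": "ACGTAGCGTTAG",
    "C": "ACGTACCGTTAG",
}
toy_snps = find_snps(toy_genomes, k=5)
```

Each reported context contributes one column to a SNP matrix, with one row per genome, which could then be passed to a phylogenetic inference program. Real genomes would need a much larger k than this toy value so that flanking contexts are unique.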

Keywords: CRISPR; H antigens; O antigens; comparative method; lineage-through-time plot; serovar.

Figures

Fig. 1.—
Phylogenetic tree based on the maximum-likelihood method implemented in RAxML. Bold black branches represent 90–100% bootstrap support. Bold gray branches represent 70–90% bootstrap support. Numbers associated with branches are SNPs unique to that lineage. For the purposes of this figure, the long outgroup branches were shortened; however, the original tree is available for download at TreeBase.org. Three antigen characters are mapped onto this phylogeny: O group, Phase 1 (H) flagellar antigen, and Phase 2 (H) flagellar antigen.
Fig. 2.—
Lineage-through-time plots illustrating fluctuations in diversification rate throughout the evolutionary history of S. enterica.
Fig. 3.—
Bayesian clustering results for values of k = 2–5 based on the matrix containing SNPs present in at least 95% of the samples (outgroups were excluded). Different colors represent different clusters and the bars represent different individuals. The extent to which different colors comprise a bar is indicative of the degree of admixture. Samples are in the same order as they are in the ML phylogeny (fig. 1), which is shown for comparison.
Fig. 4.—
ML phylogeny from figure 1, pruned to the strains for which we have CRISPR data (126 in-house collected draft genomes plus published complete genomes). The four most ancestral spacers were extracted from the CRISPR alignment in supplementary data S4, Supplementary Material online, and mapped onto the tree. Spacers with the same coloring share exactly the same underlying sequence; different colors represent different underlying sequences. Blue bars represent CRISPR length, which was determined from the number of unaligned spacers for each CRISPR locus. Spacer deletions are represented by a black square with an x.
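As background for the lineage-through-time plots of fig. 2, the following minimal Python sketch shows how the plotted quantity is usually obtained from a dated tree. It assumes a rooted, fully bifurcating, ultrametric tree summarized by the ages of its internal nodes; the function name and the example ages are hypothetical, and this is not the software or data used in the study.

```python
def lineages_through_time(node_ages):
    """Return (age, lineage count) steps, ordered from the root toward the present.

    node_ages holds the age (time before present) of every internal node of a
    rooted, bifurcating, ultrametric tree. Each bifurcation adds one lineage,
    so a tree with n internal nodes ends with n + 1 lineages (its tips).
    """
    steps = []
    lineages = 1  # a single lineage exists before the root splits
    for age in sorted(node_ages, reverse=True):  # oldest node (root) first
        lineages += 1
        steps.append((age, lineages))
    return steps


# Hypothetical node ages for a five-tip tree (arbitrary time units).
example_steps = lineages_through_time([1.0, 0.5, 0.2, 0.8])
```

Plotting the logarithm of the lineage count against time gives the usual lineage-through-time curve. Stretches where the curve steepens correspond to intervals in which branching events are more closely spaced, which is how periods of accelerated diversification show up.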
