Nat Ecol Evol. 2017 Dec;1(12):1950-1960.
doi: 10.1038/s41559-017-0337-x. Epub 2017 Oct 16.

Frequency-dependent selection in vaccine-associated pneumococcal population dynamics

Jukka Corander et al. Nat Ecol Evol. 2017 Dec.

Abstract

Many bacterial species are composed of multiple lineages distinguished by extensive variation in gene content. These often cocirculate in the same habitat, but the evolutionary and ecological processes that shape these complex populations are poorly understood. Addressing these questions is particularly important for Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen, because the changes in population structure associated with the recent introduction of partial-coverage vaccines have substantially reduced pneumococcal disease. Here we show that pneumococcal lineages from multiple populations each have a distinct combination of intermediate-frequency genes. Functional analysis suggested that these loci may be subject to negative frequency-dependent selection (NFDS) through interactions with other bacteria, hosts or mobile elements. Correspondingly, these genes had similar frequencies in four populations with dissimilar lineage compositions. These frequencies were maintained following substantial alterations in lineage prevalences once vaccination programmes began. Fitting a multilocus NFDS model of post-vaccine population dynamics to three genomic datasets using Approximate Bayesian Computation generated reproducible estimates of the influence of NFDS on pneumococcal evolution, the strength of which varied between loci. Simulations replicated the stable frequency of lineages unperturbed by vaccination, patterns of serotype switching and clonal replacement. This framework highlights how bacterial ecology affects the impact of clinical interventions.
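The idea that gene frequencies can be restored even when lineage prevalences change can be illustrated with a small deterministic toy. This is only a sketch under stated assumptions: it is not the multilocus NFDS model or the Approximate Bayesian Computation fit described in the abstract. The four genotypes, their starting frequencies, the selection strength `sigma`, the fitness form and the treatment of vaccination as instant removal of one genotype are all hypothetical choices made for illustration.

```python
def gene_frequencies(genotype_freqs, genotypes):
    """Frequency of each locus, given genotype frequencies and 0/1 gene content."""
    n_loci = len(genotypes[0])
    return [
        sum(freq * genotype[locus] for freq, genotype in zip(genotype_freqs, genotypes))
        for locus in range(n_loci)
    ]


def nfds_step(genotype_freqs, genotypes, equilibrium, sigma):
    """One generation of a toy NFDS update (assumed form, not the paper's model).

    A genotype gains fitness for every gene it carries that is currently
    rarer than its equilibrium frequency, and loses fitness for every gene
    that is currently more common than its equilibrium frequency.
    """
    current = gene_frequencies(genotype_freqs, genotypes)
    weights = []
    for freq, genotype in zip(genotype_freqs, genotypes):
        advantage = sum(
            present * (eq - cur)
            for present, eq, cur in zip(genotype, equilibrium, current)
        )
        weights.append(freq * (1.0 + sigma) ** advantage)
    total = sum(weights)
    return [w / total for w in weights]


def max_deviation(genotype_freqs, genotypes, equilibrium):
    """Largest absolute gap between current and equilibrium gene frequencies."""
    current = gene_frequencies(genotype_freqs, genotypes)
    return max(abs(eq - cur) for eq, cur in zip(equilibrium, current))


# Hypothetical genotypes over two accessory loci; genotype 0 is the vaccine type.
genotypes = [(1, 0), (1, 1), (0, 1), (0, 0)]
pre_vaccine = [0.3, 0.2, 0.3, 0.2]
equilibrium = gene_frequencies(pre_vaccine, genotypes)

# Crude stand-in for vaccination: remove genotype 0 and renormalise the rest.
survivors = [0.0] + pre_vaccine[1:]
total = sum(survivors)
population = [f / total for f in survivors]
deviation_after_vaccine = max_deviation(population, genotypes, equilibrium)

for _ in range(200):
    population = nfds_step(population, genotypes, equilibrium, sigma=1.0)
deviation_after_selection = max_deviation(population, genotypes, equilibrium)
```

In this toy, removing the vaccine-type genotype pushes both gene frequencies away from their pre-vaccine values. Selection then moves them back towards those values by reshuffling the surviving genotypes, so `deviation_after_selection` ends up much smaller than `deviation_after_vaccine`.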

Conflict of interest statement

Competing financial interests

ML has consulted for Pfizer, Affinivax and Merck and has received grant support not related to this paper from Pfizer and PATH Vaccine Solutions. WPH, ML and NJC have consulted for Antigen Discovery Inc.

Figures

Figure 1
Diversity and structure of the pneumococcal population. a Functional classification of the 1,112 intermediate-frequency and 1,194 core COGs in the Massachusetts pneumococcal population, as detailed in Supplementary Datasets 1 and 2. Each bar chart compares the frequencies of functional categories in intermediate-frequency and core COGs. Categories are grouped as likely to be under NFDS resulting from bacterium-MGE interactions (pink segments), bacterium-bacterium interactions (blue segments), or bacterium-host interactions (green segments). The chart with orange segments shows the frequencies of loci that have roles in general metabolism or signal transduction, or that could not otherwise be classified. b Population structure of the 4,127 isolates from Massachusetts (Mass), Southampton (Soton), Nijmegen and Maela (Supplementary Dataset 3). The maximum likelihood phylogeny was generated from 1,447 core gCOGs. The adjacent columns contain a row for each genome, indicating the population from which the bacterium was isolated, its susceptibility to PCV7-induced immunity, and its sequence cluster classification. c Comparison of core genome divergence, quantified as the cophenetic distance between isolates in the core genome phylogeny, and accessory genome divergence, quantified as the Jaccard distance between the gCOG content of isolates. Each point is a pairwise comparison between randomly sampled isolates (excluding the polyphyletic SC0). Points are coloured orange if the isolates belonged to the same sequence cluster; purple if they belonged to different sequence clusters but were both encapsulated; or otherwise dark blue, revealing the presence of some genetically divergent unencapsulated genotypes. Isocontour lines quantify the distribution of points in each category.
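The accessory genome divergence in panel c is a Jaccard distance between the gCOG content of two isolates. A minimal sketch of that calculation follows. The gene identifiers and isolate names are invented for illustration, and treating two empty gene sets as identical is an assumption of this sketch rather than something stated in the caption.

```python
def jaccard_distance(genes_a, genes_b):
    """Jaccard distance between two collections of gene identifiers."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        # Assumption: two isolates with no accessory genes count as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical accessory gene content for two isolates.
isolate_1 = {"gCOG_001", "gCOG_002", "gCOG_003"}
isolate_2 = {"gCOG_002", "gCOG_003", "gCOG_004"}

# Two shared genes out of four distinct genes overall.
distance = jaccard_distance(isolate_1, isolate_2)
```

A distance of 0 means identical gene content and a distance of 1 means no shared genes.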
Figure 2
Distribution of genetic diversity between populations. Column (a) compares the distribution of sequence clusters between populations; the frequency of each sequence cluster in Massachusetts is shown on the horizontal axis, and the corresponding frequencies in Maela, Southampton and Nijmegen are shown on the vertical axes in the plots from top to bottom. Red points correspond to predominantly VT (≥75%) sequence clusters; blue points to predominantly NVT (≥75%) sequence clusters, and black points to mixed sequence clusters. Column (b) compares the distribution of gCOGs between populations. The frequency of each in Massachusetts is shown on the horizontal axis, and the corresponding frequencies in Maela, Southampton and Nijmegen are shown on the vertical axes. Only gCOGs present at a mean frequency between 5% and 95% across the two compared populations were included, and the corresponding points are coloured according to the functional annotation of COGs in Fig 1a. The elevated frequencies of gCOGs encoded by Tn916, including the tetM tetracycline resistance gene, in Maela are annotated. Column (c) compares the pre- and post-vaccination frequencies of sequence clusters in Massachusetts, Southampton and Nijmegen. Points are coloured as in (a), showing the general decline in the frequency of VT sequence clusters. Column (d) compares the pre- and post-vaccination frequency of gCOGs in Massachusetts, Southampton and Nijmegen. Only gCOGs with an overall frequency between 5% and 95% in the relevant population were included in the panels. Points are coloured as in (b). The reduced frequency of the wciN allele involved in synthesis of the VT 6A and 6B capsules is annotated. As the relationships between gCOG frequencies were linear, each panel displays Pearson's correlation statistics, including two-sided p values.
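The gCOG comparisons in columns (b) and (d) combine a frequency filter with Pearson's correlation. The sketch below shows one way to do this for the two-population case in column (b), using only the standard library. The frequency values are made up, the inclusive treatment of the 5% and 95% bounds is an assumption, and the two-sided p value reported in the figure is not computed here.

```python
import math


def filter_by_mean_frequency(freqs_a, freqs_b, low=0.05, high=0.95):
    """Keep gene frequency pairs whose mean across both populations is in range.

    Assumption: the bounds are treated as inclusive.
    """
    return [(a, b) for a, b in zip(freqs_a, freqs_b) if low <= (a + b) / 2 <= high]


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical gCOG frequencies in two populations (one value per gCOG).
freqs_pop1 = [0.10, 0.40, 0.70, 0.99, 0.02]
freqs_pop2 = [0.15, 0.35, 0.75, 0.97, 0.01]

# The last two gCOGs are near-core and near-absent, so the filter drops them.
kept = filter_by_mean_frequency(freqs_pop1, freqs_pop2)
r = pearson_r(kept)
```

Filtering first matters because near-core and very rare genes would otherwise cluster at the corners of the plot and inflate the correlation.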
Figure 3
Comparing the sampled and simulated pneumococcal populations. In each barplot, the bacterial population is split into sequence clusters by vertical black lines, annotated at the top of the graph. Each sequence cluster is split into three timepoints: pre-vaccination, a midpoint sample and a late sample. Only sequence clusters present at greater than 2.5% frequency at one of these timepoints in the genomic sample are included in the graphs; full results are shown in the supplementary materials. The bars at each timepoint are split into red segments, for VT isolates, and blue segments, for NVT isolates. In each comparison, the top row is the genomic sample against which simulations were evaluated. The bottom row summarises the output of 100 simulations using the heterogeneous rate multilocus NFDS model performed using the point estimate parameter values from Table 1. At the times at which samples were present in the respective genomic collections, the same numbers of isolates were randomly selected from the simulated populations. The bars represent the median result, and the error bars (orange for VT isolates, and purple for NVT isolates) represent the interquartile range observed across the simulations. (a) The results for Massachusetts split isolates into pre-vaccination (2001; 133 isolates), midpoint (2004; 203 isolates) and late (2007; 280 isolates) samples. (b) The results for Southampton, splitting isolates into pre-vaccination (up to 2007; 100 isolates), midpoint (2008-2009; 194 isolates) and late (2010-2011; 195 isolates) samples. (c) The results for Nijmegen, splitting isolates into pre-vaccination (up to 2007; 209 isolates), midpoint (2008-2009; 80 isolates) and late (2010-2011; 48 isolates) samples.

Comment in

  • The great escape.
    Heinz E. Nat Rev Microbiol. 2017 Dec 8;16(1):4. doi: 10.1038/nrmicro.2017.156. PMID: 29217841. No abstract available.
