Genome Biol Evol. 2022 Apr 10;14(4):evac052.
doi: 10.1093/gbe/evac052.

Comparative Genomics of Disease and Carriage Serotype 1 Pneumococci

Chrispin Chaguza et al. Genome Biol Evol. 2022.

Abstract

The isolation of Streptococcus pneumoniae serotypes in systemic tissues of patients with invasive disease versus the nasopharynx of healthy individuals with asymptomatic carriage varies widely. Some serotypes are hyper-invasive, particularly serotype 1, but the underlying genetics remain poorly understood because carriage isolates are rare, which reduces the power of comparisons with invasive isolates. Here, we use a well-controlled genome-wide association study to search for genetic variation associated with invasiveness of serotype 1 pneumococci from a serotype 1 endemic setting in Africa. We found no consensus evidence that certain genomic variation is overrepresented among isolates from patients with invasive disease relative to isolates from asymptomatic carriage. Overall, the genomic variation explained negligible phenotypic variability, suggesting a minimal effect on disease status. Furthermore, changes in lineage distribution were seen, with lineages replacing each other over time, highlighting the importance of continued pathogen surveillance. Our findings suggest that hyper-invasiveness is an intrinsic property of serotype 1 strains, not specific to a "disease-associated" subpopulation disproportionately harboring unique genomic variation.

Keywords: Streptococcus pneumoniae; bacterial genomics; genome-wide association study; genomic epidemiology; invasiveness; pathogenicity.

Figures

Fig. 1.
Characteristics and genetic relatedness of the pneumococcal serotype 1 isolates used in the study. (a) Map of Africa showing the location of The Gambia in West Africa, where the study was conducted. (b) Disease status as defined by the anatomical site of the human body from which the serotype 1 isolates used in the study were sampled. (c) Bar plot showing the number of whole-genome sequenced serotype 1 isolates from carriage (n = 65) and disease (n = 139). (d) Distribution of the serotype 1 isolates from carriage and disease by the age of the individuals. (e) Line plot showing the temporal distribution of the serotype 1 isolates from carriage and disease. (f) A maximum-likelihood phylogenetic tree, constructed after removing SNPs in recombinogenic regions, showing the genetic relatedness of the carriage and disease serotype 1 isolates. The icons in (b) were created with permission in BioRender.com (https://biorender.com/).
Fig. 2.
Phylogenetic signal and distribution of the pneumococcal serotype 1 isolates based on disease status. (a) Maximum-likelihood phylogeny showing the posterior probability of each disease state at each terminal and internal node of the tree based on stochastic ancestral character reconstruction. The internal nodes are drawn with a larger radius to distinguish them from the terminal nodes. The colors of the nodes represent the disease status of the isolates as shown in the key next to the phylogenetic tree. (b) Phylogenetic tree of a subset of the serotype 1 isolates belonging to clade IV, which is predominantly associated with ST3081, the most common serotype 1 ST in The Gambia, West Africa. (c) Zoomed-in phylogenetic tree of the isolates belonging to clade I, which contains isolates belonging to ST618, the dominant serotype 1 ST in The Gambia before its replacement by ST3081 in the early 2000s. (d) Estimated genetic signals associated with disease status of the serotype 1 isolates using Pagel’s λ statistic. The transition rates between disease states are shown next to the arrows and the values of Pagel’s λ statistic are shown at the bottom of the diagram for each model.
Fig. 3.
Overview of the GWASs performed in this study using different methods and types of genetic variation. Summary of the number of pneumococcal serotype 1 isolates sampled from healthy individuals with asymptomatic carriage and patients with invasive diseases. Three different types of genetic variation, namely, presence/absence of accessory genes, SNPs, and unitigs, were used for the GWAS. Each type of genetic variation was analyzed using multiple approaches: two linear mixed model methods (FaST-LMM and GEMMA) and a phylogenetic or evolutionary convergence-based method (Scoary).
Fig. 4.
Manhattan and volcano plots showing statistical significance and effect sizes of the genetic variants associated with disease status in the GWAS. Venn diagrams showing the number of statistically significant (a) SNPs, (b) accessory genes, and (c) unitig sequences identified by each GWAS method. The total number of variants tested is specified in the title for a–c. The absence of data for Scoary in (a) reflects the fact that we did not run a GWAS of the SNPs using this method. Manhattan plots showing statistical significance (-log10[unadjusted P-value]) and chromosomal location of the unitig sequences for the GWAS using (d) FaST-LMM, (e) GEMMA, and (f) Scoary. Volcano plots showing the relationship between statistical significance (-log10[unadjusted P-value]) and the effect size in terms of the log-transformed (base 2) asymptomatic carriage-to-disease odds ratio for the accessory gene sequences for the GWAS using (g) FaST-LMM, (h) GEMMA, and (i) Scoary. The points in all the graphs are colored based on the odds ratio, as shown in the key on the right of each diagram. The blue line represents the genome-wide statistical significance threshold based on the Bonferroni adjustment. The unitig sequences shown in g–i were mapped to a complete reference genome for serotype 1 strain PNI0373 from The Gambia belonging to sequence type ST168 (GenBank accession: CP001845).
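
The two quantities plotted in Fig. 4, the Bonferroni threshold on the -log10(P) scale and the log2 carriage-to-disease odds ratio, can be illustrated with a minimal Python sketch. This is not the authors' code. The significance level (0.05), the number of variants tested, the variant counts, and the 0.5 continuity correction are all assumptions chosen for illustration; only the group sizes (65 carriage, 139 disease) come from the Fig. 1 caption.

```python
import math


def bonferroni_line(alpha: float, n_tests: int) -> float:
    """Height of the Bonferroni threshold on a -log10(P) axis."""
    return -math.log10(alpha / n_tests)


def log2_carriage_to_disease_or(carriage_with: int, carriage_without: int,
                                disease_with: int, disease_without: int,
                                pseudocount: float = 0.5) -> float:
    """log2 odds ratio of carrying a variant in carriage vs. disease isolates.

    The pseudocount (an assumption, not taken from the paper) is added to every
    cell so that a zero count does not produce a division by zero.
    """
    a = carriage_with + pseudocount
    b = carriage_without + pseudocount
    c = disease_with + pseudocount
    d = disease_without + pseudocount
    return math.log2((a / b) / (c / d))


if __name__ == "__main__":
    # Hypothetical: 10,000 variants tested at alpha = 0.05.
    print(round(bonferroni_line(0.05, 10_000), 3))
    # Hypothetical variant present in 20 of 65 carriage and 30 of 139 disease isolates.
    print(round(log2_carriage_to_disease_or(20, 45, 30, 109), 3))
```

A positive value of the second quantity means the variant is more common among carriage isolates; a negative value means it is more common among disease isolates.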
