Microbiol Spectr. 2024 Nov 5;12(12):e0364323.
doi: 10.1128/spectrum.03643-23. Online ahead of print.

Evaluating methods for identifying and quantifying Streptococcus pneumoniae co-colonization using next-generation sequencing data

Jada Hackman et al. Microbiol Spectr.

Abstract

Detection of multiple pneumococcal serotype carriage can enhance monitoring of pneumococcal vaccine impact, particularly among high-burden childhood populations. We assessed methods for identifying co-carriage of pneumococcal serotypes from whole-genome sequences. Twenty-four nasopharyngeal samples were collected during community carriage surveillance from healthy children in Blantyre, Malawi, and were then serotyped by microarray. Pneumococcal DNA from culture plate sweeps was sequenced using Illumina MiSeq, and genomic serotyping was carried out using SeroCall and PneumoKITy. Their sensitivity was calculated in reference to the microarray data. Local maxima in the single-nucleotide polymorphism (SNP) density distributions were assessed for their correspondence to the relative abundance of serotypes. Across the 24 individuals, the microarray detected 77 non-unique serotypes, of which 42 occurred at high relative abundance (>10%) (per individual, median, 3; range, 1-6 serotypes). The average sequencing depth was 57X (range: 21X-88X). The sensitivity of SeroCall for identifying high-abundance serotypes was 98% (95% CI, 0.87-1.00), 20% (0.08-0.36) for low abundance (<10%), and 62% (0.50-0.72) overall. PneumoKITy's sensitivity was 86% (0.72-0.95), 20% (0.06-0.32), and 56% (0.42-0.65), respectively. Local maxima in the SNP frequency distribution were highly correlated with the relative abundance of high-abundance serotypes. Six samples were resequenced, and the pooled runs had an average fourfold increase in sequencing depth. This allowed genomic serotyping of two of the seven previously undetectable low-abundance serotypes. Genomic serotyping is highly sensitive for the detection of high-abundance serotypes in samples with co-carriage. Serotype-associated reads may be identified through SNP frequency, and increased read depth can increase sensitivity for low-abundance serotype detection.

Importance: Pneumococcal carriage is a prerequisite for invasive pneumococcal disease, which is a leading cause of childhood pneumonia. Multiple carriage of unique pneumococcal serotypes at a single time point is prevalent among high-burden childhood populations. This study assessed the sensitivity of different genomic serotyping methods for identifying pneumococcal serotypes during co-carriage. These methods were evaluated against the current gold standard for co-carriage detection. The results showed that genomic serotyping methods have high sensitivity for detecting high-abundance serotypes in samples with co-carriage, and increasing sequencing depth can increase sensitivity for low-abundance serotypes. These results are important for monitoring vaccine impact, which aims to reduce the prevalence of specific pneumococcal serotypes. By accurately detecting and identifying multiple pneumococcal serotypes in carrier populations, we can better evaluate the effectiveness of vaccination programs.

Keywords: Africa; Streptococcus pneumoniae; co-carriage; microarray; pneumococcus; sequencing; serotyping.

Conflict of interest statement

Jason Hinds is involved in studies at St George's, University of London, or BUGS Bioscience that are sponsored by vaccine manufacturers, including Pfizer, GlaxoSmithKline, and Sanofi Pasteur. He is also a co-founder and shareholder of BUGS Bioscience, a not-for-profit biotech company in charge of microarray results for this study out of St George's, University of London. No other authors declare competing interests.

Figures

Fig 1
(A) Top: relative abundance of 77 serotypes detected by microarray (x-axis) and their relative abundances observed by SeroCall (y-axis). The distance from the diagonal line represents the extent of discordance; points below the diagonal line are samples with higher relative abundance by microarray, and points above are samples with lower relative abundance by microarray. (A) Bottom: SeroCall sensitivity (%) to identify serotypes detected by microarray, regardless of their relative abundance that SeroCall observed, with the light gray line representing a 95% binomial confidence interval. (B) Same as (A) but using PneumoKITy as the genomic serotyping method.
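The sensitivity values and 95% binomial confidence intervals shown in this figure can be illustrated with a short calculation. The following is a minimal sketch, not the authors' code: the source does not say which binomial interval was used, so the Wilson score interval is an assumption here, and the function name `sensitivity_with_wilson_ci` is hypothetical. The counts in the usage lines (41 of 42) are also hypothetical, chosen only because 41/42 rounds to the 98% reported for high-abundance serotypes.

```python
import math


def sensitivity_with_wilson_ci(detected, total, z=1.96):
    """Return (sensitivity, lower, upper) using a Wilson score interval.

    detected: serotypes found by the genomic method (hypothetical count)
    total:    serotypes found by the reference method (microarray)
    z:        normal quantile; 1.96 corresponds to a 95% interval
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= detected <= total:
        raise ValueError("detected must lie between 0 and total")

    p = detected / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 41 of 42 high-abundance serotypes detected.
sens, low, high = sensitivity_with_wilson_ci(41, 42)
print(f"sensitivity {sens:.2f} (95% CI {low:.2f}-{high:.2f})")
```

A different interval (for example, an exact Clopper-Pearson interval) would give slightly different bounds, so the output should not be expected to match the published intervals exactly.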
Fig 2
(A) Examples of SNP density plots (left) and SNP frequency plots in reference to the KK0981 whole genome (right), where each point is a mutation and its position along the y-axis is the frequency of that mutation relative to the reference genome. The green dotted lines represent local maxima in the SNP density distribution as identified by visual inspection. S02 shows evidence that the individual was infected with a single haplotype, with the widest SNP frequency band positioned near 100%. S03 shows evidence that the individual is infected with two haplotypes present at 20% and 80% frequencies. S04 is an example with evidence of a probable single population, although small local maxima indicate potential unobserved minor variants. S05 is an example of clear co-carriage in which the local maxima are difficult to distinguish. The red box in the density plot highlights the threshold (<0.3) that was set to minimize potential artifacts due to sequencing error. The blue box in the frequency plot highlights the SNPs that occur at a frequency of 100%, which are SNPs that are present in both the sample and reference genome. (B) Top: relative abundance of 77 serotypes detected by microarray (x-axis) and likely corresponding local maxima in the SNP density distribution (y-axis). Bottom: observed local maxima sensitivity (%) to identify serotypes detected by microarray. The light gray line represents a 95% binomial confidence interval.
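The idea of reading haplotype abundances off local maxima in the SNP frequency density can be sketched in code. The caption states that the maxima were identified by visual inspection, so the automated peak search below is a substitute technique for illustration, not the authors' method. The SNP frequencies are synthetic (two clusters near 20% and 80%, loosely modelled on the S03 description), and the helper names, the `bandwidth` value, and the `min_height` cut-off are all assumptions. The sketch does not attempt to reproduce the <0.3 threshold from the figure.

```python
import math
import random


def gaussian_kde(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `values` at each grid point."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values)
        for g in grid
    ]


def local_maxima(grid, density, min_height=0.0):
    """Return grid positions whose density exceeds both neighbours and min_height."""
    peaks = []
    for i in range(1, len(grid) - 1):
        is_peak = density[i] > density[i - 1] and density[i] >= density[i + 1]
        if is_peak and density[i] >= min_height:
            peaks.append(grid[i])
    return peaks


# Synthetic SNP frequencies: two haplotypes at roughly 20% and 80% (assumed data).
rng = random.Random(0)
snp_freqs = [min(1.0, max(0.0, rng.gauss(0.2, 0.03))) for _ in range(300)]
snp_freqs += [min(1.0, max(0.0, rng.gauss(0.8, 0.03))) for _ in range(300)]

grid = [i / 200 for i in range(201)]  # frequencies from 0.0 to 1.0
density = gaussian_kde(snp_freqs, grid, bandwidth=0.03)
peaks = local_maxima(grid, density, min_height=0.5)
print([round(p, 2) for p in peaks])
```

The bandwidth controls how readily nearby peaks merge. A case like S05, where the maxima are hard to distinguish by eye, would likely be sensitive to that choice as well.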
