Genome Med. 2022 Feb 9;14(1):13.
doi: 10.1186/s13073-022-01017-x.

Whole genome sequencing-based classification of human-related Haemophilus species and detection of antimicrobial resistance genes

Margo Diricks et al. Genome Med.

Erratum in

Abstract

Background: Bacteria belonging to the genus Haemophilus cause a wide range of diseases in humans. Recently, H. influenzae was classified by the WHO as a priority pathogen due to the wide spread of ampicillin-resistant strains. However, other Haemophilus spp. are often misclassified as H. influenzae. Therefore, we established an accurate and rapid whole genome sequencing (WGS)-based classification and serotyping algorithm and combined it with the detection of resistance genes.

Methods: A gene presence/absence-based classification algorithm was developed, which employs the open-source gene-detection tool SRST2 and a new classification database comprising 36 genes, including capsule loci for serotyping. These genes were identified using a comparative genome analysis of 215 strains belonging to ten human-related Haemophilus (sub)species (training dataset). The algorithm was evaluated on 1329 public short read datasets (evaluation dataset) and used to reclassify 262 clinical Haemophilus spp. isolates from 250 patients (German cohort). In addition, the presence of antibiotic resistance genes within the German dataset was evaluated with SRST2 and correlated with results of traditional phenotyping assays.
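The presence/absence logic described in the Methods can be illustrated with a minimal sketch. It assumes, as the Fig. 2 caption states, that a (sub)species tag is assigned once a minimum number of its marker genes is detected. The panel below, including the species labels, gene names and thresholds, is hypothetical and does not reproduce the published 36-gene database; the set of detected genes stands in for the output of a gene-detection tool such as SRST2.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical marker panel: species tag -> (marker genes, minimum hits required).
# Names and thresholds are placeholders, NOT the published classification database.
MARKER_PANEL: Dict[str, Tuple[Set[str], int]] = {
    "species_A": ({"geneA1", "geneA2", "geneA3"}, 2),
    "species_B": ({"geneB1", "geneB2"}, 2),
}


def classify(detected: Set[str],
             panel: Dict[str, Tuple[Set[str], int]] = MARKER_PANEL) -> List[str]:
    """Return every species tag whose detected marker count reaches its threshold."""
    tags = []
    for species, (markers, min_hits) in panel.items():
        # Count how many of this species' markers were detected in the sample.
        if len(markers & detected) >= min_hits:
            tags.append(species)
    return sorted(tags)


if __name__ == "__main__":
    print(classify({"geneA1", "geneA3"}))                      # one full pattern
    print(classify({"geneA1", "geneA2", "geneB1", "geneB2"}))  # two full patterns
    print(classify({"geneA1"}))                                # below threshold
```

In this sketch, a sample that satisfies more than one pattern receives several tags, which loosely mirrors how the Fig. 3 caption describes multiple full marker patterns as indicating multiple distinct Haemophilus species.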

Results: The newly developed algorithm can differentiate between clinically relevant Haemophilus species including, but not limited to, H. influenzae, H. haemolyticus, and H. parainfluenzae. It can also identify putative haemin-independent H. haemolyticus strains and determine the serotype of typeable Haemophilus strains. The algorithm performed excellently in the evaluation dataset (99.6% concordance with reported species classification and 99.5% with reported serotype) and revealed several misclassifications. Additionally, 83 out of 262 (31.7%) suspected H. influenzae strains from the German cohort were in fact H. haemolyticus strains, some of which were associated with mouth abscesses and lower respiratory tract infections. Resistance genes were detected in 16 out of 262 datasets from the German cohort. Prediction of ampicillin resistance, associated with blaTEM-1D, and tetracycline resistance, associated with tetB, correlated well with available phenotypic data.
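As a quick arithmetic check, the cohort proportions can be recomputed from the counts given in the Results. The helper name `percent` is an illustrative choice; the second figure is derived here from the reported counts and is not stated in the abstract.

```python
def percent(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Counts taken from the Results paragraph (German cohort, 262 datasets).
reclassified = percent(83, 262)     # suspected H. influenzae that were H. haemolyticus
with_resistance = percent(16, 262)  # datasets with at least one resistance gene

print(reclassified)     # 31.7
print(with_resistance)  # 6.1
```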

Conclusions: Our new classification database and algorithm have the potential to improve diagnosis and surveillance of Haemophilus spp. and can easily be coupled with other public genotyping and antimicrobial resistance databases. Our data also point towards a possible pathogenic role of H. haemolyticus strains, which needs to be further investigated.

Keywords: Antibiotic resistance; H. haemolyticus; H. influenzae; Haemophilus; Identification; Molecular differentiation; Pangenome-wide association study; Precision medicine; Whole genome sequencing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Presence and absence of marker genes in the training dataset. The phylogenetic tree is based on the alignment of 455 core genes (present in at least 90% of the strains) inferred from 215 whole genome sequencing datasets of human-related Haemophilus spp. A Presence/absence of marker genes that specifically discriminate between H. haemolyticus and H. influenzae. B Presence/Absence of haemin biosynthesis genes (hem*), which are colored according to the species identity of the reference alleles for which a valid hit was found. C Presence/absence of lacZ, a β-galactosidase gene that differentiates between H. parahaemolyticus and H. paraphrohaemolyticus. D Presence/absence of nadV, which is related to the H. ducreyi characteristic V factor independency. E Presence/absence of Region I (bex*), region II and region III (hcs*) capsule loci (in silico serotyping). All annotated Haemophilus spp. clades were separated with a strong local support value (100%)
Fig. 2
Decision algorithm to classify human-related strains of Haemophilus spp. based on whole genome sequencing data. The number next to the arrow specifies the minimum number of marker genes that needs to be detected before a (sub)species tag is attributed to the strain
Fig. 3
Phylogeny of 262 clinical Haemophilus spp. isolates from a German cohort. The phylogenetic tree is based on the alignment of 104 core genes (present in at least 90% of the strains). A Kraken2 read classification output. The length of a bar is proportional to the percentage of reads that are assigned to the respective taxon (as indicated by the color). One H. influenzae culture (located in the phylogenetic tree in the “fuzzy” clade) was likely contaminated with a Streptococcus sp. strain (19% of the reads assigned to this species) and another one with an Aggregatibacter sp. strain (52% reads assigned to this species). B Presence/absence of marker genes included in our new taxonomic classification database. C Final classification output of the decision algorithm. Mixed colors represent the presence of multiple full marker patterns, indicating multiple distinct Haemophilus species. D Presence/absence of antibiotic resistance genes included in a public resistance database. Color codes correlate to the antibiotic class to which the gene confers resistance: aminoglycosides (Agly), β-lactam antibiotics (Bla), phenicols (Phe), trimethoprim (Tmt), macrolide-lincosamide-streptogramin (MLS), sulfonamides (Sul), and tetracyclines (Tet)
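The captions of Fig. 1 and Fig. 3 define core genes as those present in at least 90% of the strains. A minimal sketch of that filter is given below, assuming the input is a mapping from strain name to the set of genes detected in it. The function name, data layout and toy gene names are illustrative assumptions and do not represent the pipeline used by the authors.

```python
from typing import Dict, List, Set


def core_genes(presence: Dict[str, Set[str]], min_fraction: float = 0.9) -> List[str]:
    """Return genes present in at least `min_fraction` of the strains."""
    n_strains = len(presence)
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    # Keep genes whose prevalence across strains reaches the cut-off.
    return sorted(g for g, c in counts.items() if c / n_strains >= min_fraction)


if __name__ == "__main__":
    # Toy data: g1 in 10/10 strains, g2 in 9/10, g3 in 8/10 (hypothetical genes).
    toy = {f"strain{i}": {"g1"} for i in range(10)}
    for i in range(9):
        toy[f"strain{i}"].add("g2")
    for i in range(8):
        toy[f"strain{i}"].add("g3")
    print(core_genes(toy))
```

With the toy data, only the genes found in nine or more of the ten strains pass the 90% cut-off.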
