Commun Biol. 2024 Aug 23;7(1):1038.
doi: 10.1038/s42003-024-06720-6.

PathoTracker: an online analytical metagenomic platform for Klebsiella pneumoniae feature identification and outbreak alerting

Shuyi Wang et al. Commun Biol. 2024.

Abstract

Clinical metagenomics (CMg) Nanopore sequencing can facilitate infectious disease diagnosis. In China, the ST11-KL64 and ST11-KL47 sub-lineages of carbapenem-resistant Klebsiella pneumoniae (CRKP) are widely prevalent. We propose PathoTracker, a purpose-built database and accompanying method for identifying strain features in CMg samples and tracing CRKP. We created two databases: one targeting high-prevalence horizontal gene transfer in CRKP strains, and an ST11-only database for distinguishing the two sub-lineages in China. To make the databases user-friendly, enable immediate downstream strain feature identification from raw Nanopore metagenomic data, and avoid the need for phylogenetic analysis from scratch, we developed data analysis methods. These methods comprise a pre-performed phylogenetic analysis, a gene-isolate-cluster index and a multilevel pan-genome database; they reduced storage space by 10-fold and random-access memory by 52-fold compared with conventional methods. PathoTracker can provide accurate and fast strain-level analysis of CMg data after 1 h of Nanopore sequencing, allowing early warning of outbreaks. A user-friendly web page ( http://PathoTracker.pku.edu.cn/ ) was developed to facilitate online analysis, including strain-level feature identification, species identification and phylogenetic analysis. PathoTracker will aid the downstream analysis of CMg data.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. PathoTracker study flowchart.
a Database construction. For each strain, antimicrobial susceptibility testing was performed to obtain minimum inhibitory concentration (MIC) tables, and sequencing was performed to acquire genomic information. A phylogenetic tree was constructed from all assembled genomes and divided into clusters. For each cluster, the corresponding pan-genome database and index were constructed. b Comparison process. 1) All CMg reads were compared with the pan-genome database of each cluster constructed in (a); 2) the cluster closest to the sample was selected; 3) the cluster chosen in 2) was screened, and the isolates in the database closest to the sample were selected. c Validation process. Two validation modes were used: 1) 29 CMg samples sequenced using Nanopore, together with whole-genome sequencing of Klebsiella pneumoniae strains cultured from those clinical metagenomic samples; 2) bootstrap sampling validation.
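The two-stage comparison in panel b (pick the closest cluster, then the closest isolate within it) can be illustrated with a minimal Python sketch. This is not the PathoTracker implementation: the caption does not state the similarity measure, so the sketch assumes that reads have already been reduced to a set of detected gene names, scores clusters by containment and isolates by Jaccard similarity, and uses invented toy data. All function and variable names are hypothetical.

```python
from typing import Dict, Set, Tuple

Genes = Set[str]


def containment(sample: Genes, pangenome: Genes) -> float:
    """Fraction of the sample's genes present in a cluster pan-genome (assumed score)."""
    return len(sample & pangenome) / len(sample) if sample else 0.0


def jaccard(a: Genes, b: Genes) -> float:
    """Jaccard similarity between two gene sets (assumed score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_sample(
    sample: Genes,
    pangenomes: Dict[str, Genes],
    isolates: Dict[str, Dict[str, Genes]],
) -> Tuple[str, str, float]:
    """Return (closest cluster, closest isolate in that cluster, isolate score)."""
    # Steps 1-2: compare the sample with every cluster pan-genome, keep the best.
    # Sorting the keys makes tie-breaking deterministic.
    cluster = max(sorted(pangenomes), key=lambda c: containment(sample, pangenomes[c]))
    # Step 3: search only the isolates that belong to the chosen cluster.
    members = isolates[cluster]
    isolate = max(sorted(members), key=lambda i: jaccard(sample, members[i]))
    return cluster, isolate, jaccard(sample, members[isolate])


# Invented toy data, for illustration only.
pangenomes = {
    "cluster_A": {"g1", "g2", "g3", "g4"},
    "cluster_B": {"g1", "g5", "g6"},
}
isolates = {
    "cluster_A": {"iso1": {"g1", "g2"}, "iso2": {"g1", "g2", "g3"}},
    "cluster_B": {"iso3": {"g1", "g5"}},
}
sample = {"g1", "g2", "g3"}

result = match_sample(sample, pangenomes, isolates)
print(result)  # ('cluster_A', 'iso2', 1.0)
```

Restricting the isolate search to one cluster is what keeps the second stage small: only the members of the selected cluster are scored, not every isolate in the database.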
Fig. 2. Phylogenetic trees in the database.
a Phylogenetic tree of 1,187 isolates. Due to the wide variety of ST and KL types, only the virulence score is shown. In the “ST” and “KL” rings, different colours represent different ST or KL types. The clusters obtained with TreeCluster thresholds of 0.005 and 0.015 are shown in the “0.005” and “0.015” rings. In the “0.005”, “0.015” and “cluster” rings, different colours represent different clusters. b Phylogenetic tree of ST11-CRKP isolates. The selected clusters are shown in the “cluster” ring; different colours represent different clusters. The presence or absence of blaNDM and blaKPC, and the region of China from which each Chinese ST11 strain originated, are annotated as rings. The colours of KL47 and KL64 are the same in the two sub-panels.
Fig. 3. Runtime and result validation of PathoTracker.
a PathoTracker results for CMg samples from 1 h of Nanopore sequencing data. “Read count” refers to the number of reads detected at 1 h and 72 h of sequencing. The colours in the “Read count” column represent gradient-fill data bars. To validate the PathoTracker results, NGS data from the pure culture isolates serve as the reference. “vir +” or “ARGs +” indicates the presence of virulence genes or ARGs in the CMg sample strain. “Sample vir +” or “Sample ARG +” indicates that these genes can be detected directly from the raw data after 1 h of sequencing by comparison against the VFDB and ResFinder databases. “PathoTracker vir +” or “PathoTracker ARG +” indicates that these genes can be detected by PathoTracker in data from 1 h of Nanopore sequencing. “PathoTracker/FastANI ST (sample/strain)” indicates detection of the strain’s ST type from the CMg sample and from strains obtained from cultures, while “KL” indicates detection of the KL type. Coloured sections in the heatmap indicate successful detection or the presence of virulence genes or ARGs. The samples are sorted from top to bottom by the number of reads obtained after 1 h of Nanopore sequencing. b Runtime of CMg samples for PathoTracker and FastANI. c Runtime of bootstrap validation samples for PathoTracker and FastANI. The centre line represents the median, the lower and upper hinges correspond to the first and third quartiles, the whiskers extend no further than 1.5 times the interquartile range, and the points are outliers. Welch's two-sample t-test was used to compare PathoTracker and FastANI. ****p < 0.001.
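For readers unfamiliar with the test named in this caption, the following is a minimal Python sketch of Welch's two-sample t-test, which compares two means without assuming equal variances. It uses only the standard library and returns the t statistic and the Welch-Satterthwaite degrees of freedom; it does not compute a p-value. The runtime values are invented toy numbers, not data from the study, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Per-group squared standard errors, using the sample variance (n - 1 denominator).
    sea2 = variance(a) / na
    seb2 = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(sea2 + seb2)
    df = (sea2 + seb2) ** 2 / (sea2 ** 2 / (na - 1) + seb2 ** 2 / (nb - 1))
    return t, df


# Invented toy runtimes (arbitrary units), for illustration only.
runtime_tool_a = [1.0, 2.0, 3.0, 4.0, 5.0]
runtime_tool_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(runtime_tool_a, runtime_tool_b)
print(f"t = {t_stat:.4f}, df = {dof:.4f}")  # t = -1.8974, df = 5.8824
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also reports the p-value.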
Fig. 4. Association between clusters and their corresponding ST- and KL-type.
Clusters with ≥ 15 isolates (1,187 isolates) are shown. Figure S1 displays all remaining clusters with < 15 isolates and their respective ST-KL types.
Fig. 5. PathoTracker web interface.
a Website homepage. b Details page of PathoTracker analysis. c Results page of PathoTracker analysis. d Results page of species detection analysis. e Results page of phylogenetic analysis. For phylogenetic analysis, four isolates from the database were selected as an example (C2111, C2155, C64 and C2304). For PathoTracker and species detection analysis, sample BAL177 was selected as an example.
