Brief Bioinform. 2024 Sep 23;25(6):bbae484.
doi: 10.1093/bib/bbae484.

DeepPBI-KG: a deep learning method for the prediction of phage-bacteria interactions based on key genes

Tongqing Wei et al. Brief Bioinform. 2024.

Abstract

Phages, the natural predators of bacteria, were discovered more than 100 years ago, and increasing antimicrobial resistance rates have revitalized phage research. Methods that are faster and more efficient than wet-laboratory experiments are needed to help screen phages quickly for therapeutic use. Traditional computational methods usually ignore the fact that phage-bacteria interactions are mediated by key genes and proteins. Methods for intraspecific prediction are rare, since almost all existing methods consider only interactions at the species and genus levels. Moreover, most strains in existing databases contain only partial genome information because whole-genome information for species is difficult to obtain. Here, we propose a new approach for interaction prediction that constructs new features from key genes and proteins and applies K-means sampling to select high-quality negative samples for prediction. We then develop DeepPBI-KG, a corresponding prediction tool based on feature selection and a deep neural network. The results show that the average per-strain area under the curve (AUC) reached 0.93, and that the overall AUC and area under the precision-recall curve reached 0.89 and 0.92, respectively, on the independent test set; these values are greater than those of other existing prediction tools. The forward and reverse validation results indicate that key genes and key proteins regulate and influence the interaction, which supports the reliability of the model. In addition, intraspecific prediction experiments based on Klebsiella pneumoniae data demonstrate the potential applicability of DeepPBI-KG for intraspecific prediction.
In summary, the feature engineering and interaction prediction approaches proposed in this study can effectively improve the robustness and stability of interaction prediction, can achieve high generalizability, and may provide new directions and insights for rapid phage screening for therapy.
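The abstract does not spell out how K-means sampling picks negative samples, so the following is only a minimal sketch of one common reading: cluster the unlabeled candidate pairs in feature space and keep the candidate nearest each cluster centroid, so that the selected negatives cover the feature space rather than crowding one region. The function names (`kmeans`, `select_negatives`), the random feature matrix, and the choice of one representative per cluster are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every centroid: (n, k).
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid where it is
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Recompute labels so they match the returned centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def select_negatives(candidates, n_select, seed=0):
    """Return indices of candidates closest to each of n_select centroids.

    Hypothetical helper: one representative per non-empty cluster, so the
    result may contain fewer than n_select indices.
    """
    X = np.asarray(candidates, dtype=float)
    centroids, labels = kmeans(X, n_select, seed=seed)
    chosen = []
    for j in range(n_select):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        dist = ((X[members] - centroids[j]) ** 2).sum(axis=1)
        chosen.append(int(members[dist.argmin()]))
    return sorted(chosen)


# Toy data standing in for feature vectors of unlabeled phage-host pairs.
rng = np.random.default_rng(42)
unlabeled = rng.normal(size=(200, 8))
negatives = select_negatives(unlabeled, n_select=20)
print(len(negatives), "negative samples selected")
```

Because each representative is drawn from the members of its own cluster, no candidate can be selected twice. A real pipeline would also need to exclude known positive pairs from the candidate pool before clustering.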

Keywords: deep learning; machine learning; negative sample selection; phage-bacteria interaction; receptor binding protein.

Figures

Graphical Abstract
Figure 1
Summary statistics of the data. (A) Proportion of phages by host phylum, shown as a ring chart. Phages were grouped by the name of the host species they infect and displayed by host phylum; hosts with fewer than three members under each phylum were merged into "others", and phyla containing too few host species were merged into other phyla. (B) Evolutionary tree of 1337 phages that infect Mycolicibacterium. Because of the large number of phages and the complexity of the tree, the tree was pruned and subtrees containing too many phages were merged. The number shown at a node is the number of phages in the subtree rooted at that node; a node showing a specific phage ID has no subtree beneath it. The tree of Mycolicibacterium-infecting phages was divided into four clusters, and the number of phages in each cluster is labelled. (C) Distribution of phages in the interaction dataset. (D) Distribution of hosts in the interaction dataset.
Figure 2
Feature construction and model framework. (A) DNA-protein feature construction process. (B) Key_gene feature construction process. (C) Model architecture flowchart.
Figure 3
Model prediction performance, feature importance screening, and enrichment analysis results. (A) Performance of the model for each evaluation metric in the 2436 test sets. (B) Distribution of phage and bacterial feature importance and the division of phage and bacterial feature importance under the RF model. The red axis marks in the figure represent the contribution threshold set according to PCA. (C) GO enrichment analysis dot plot and KEGG enrichment analysis dot plot of phage genes with the top feature importance values. Important pathways are marked in red. (D) GO enrichment analysis network plot and KEGG enrichment analysis network plot of host genes with the top feature importance values. Important pathways are marked in red.
Figure 4
Comparison of DeepPBI-KG with state-of-the-art methods and AUC of each strain. (A) Comparison of the receiver operating characteristic (ROC) curve performance of different classifiers. (B) Comparison of the PR curves of different classifiers. (C) Radar map of different classifiers for accuracy, precision, recall, F1 score, and specificity. (D) The AUCs of DeepPBI-KG and PB-LKS were plotted for each strain. The circle under each bar represents DeepPBI-KG and the square represents PB-LKS.
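For readers less familiar with the two headline metrics, the sketch below computes the area under the ROC curve and an estimate of the area under the precision-recall curve from labels and scores. It is illustrative only: the function names and toy data are assumptions, the ROC AUC uses the pairwise-ranking (Mann-Whitney) form, and the PR area is estimated by average precision, which may differ slightly from whatever interpolation the authors used.

```python
import numpy as np


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    # Compare every positive score with every negative score; ties count half.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))


def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive.

    Tied scores are ordered by their original position (stable sort).
    """
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    order = np.argsort(-s, kind="stable")  # highest score first
    y_sorted = y[order]
    true_pos = np.cumsum(y_sorted)
    precision = true_pos / np.arange(1, len(y_sorted) + 1)
    return float((precision * y_sorted).sum() / y_sorted.sum())


# Toy labels (1 = interaction, 0 = no interaction) and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUC = {auc:.3f}, average precision = {ap:.3f}")
```

On this toy input, 7 of the 9 positive-negative pairs are ranked correctly, so the AUC is 7/9; the positives sit at ranks 1, 3 and 4, so the average precision is (1 + 2/3 + 3/4) / 3.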
