Brief Bioinform. 2021 Jul 20;22(4):bbaa301.
doi: 10.1093/bib/bbaa301.

DeepBL: a deep learning-based approach for in silico discovery of beta-lactamases

Yanan Wang et al. Brief Bioinform.

Abstract

Beta-lactamases (BLs) are enzymes localized in the periplasmic space of bacterial pathogens, where they confer resistance to beta-lactam antibiotics. Experimental identification of BLs is costly yet crucial to understanding beta-lactam resistance mechanisms. To address this issue, we present DeepBL, a deep learning-based approach that incorporates sequence-derived features to enable high-throughput prediction of BLs. Specifically, DeepBL is implemented based on the Small VGGNet architecture and the TensorFlow deep learning library. Furthermore, the performance of DeepBL models is investigated in relation to the sequence redundancy level and negative sample selection in the benchmark dataset. The models are trained on datasets of varying sequence redundancy thresholds, and the model performance is evaluated by extensive benchmarking tests. Using the optimized DeepBL model, we perform proteome-wide screening of all reviewed bacterial protein sequences available from the UniProt database. These results are freely accessible at the DeepBL webserver at http://deepbl.erc.monash.edu.au/.

Keywords: antimicrobial resistance; beta-lactamase; bioinformatics; deep learning; sequence homology.
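
The sequence-derived features mentioned in the abstract are, according to the Figure 1 caption, CKSAAP encodings (composition of k-spaced amino acid pairs). The following is a minimal sketch of one common formulation of CKSAAP, not the authors' implementation. The function name `cksaap`, the default `k_max=3`, the per-gap normalization, and the handling of non-standard residues are all assumptions made for illustration; the source does not state them.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(sequence, k_max=3):
    """Encode a protein sequence as k-spaced amino acid pair frequencies.

    For each gap k in 0..k_max, count every pair (seq[i], seq[i + k + 1])
    and divide by the number of valid pairs at that gap. The result has
    400 * (k_max + 1) values. k_max=3 is an assumed default, not a value
    taken from the paper.
    """
    seq = sequence.upper()
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # assumption: skip pairs with non-standard residues
                counts[pair] += 1
                total += 1
        # Sequences too short for this gap contribute an all-zero block.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features
```

With these assumptions, `cksaap(seq)` yields a fixed-length vector of 1600 values regardless of sequence length, which is what allows proteins of different lengths to be fed to a fixed-input network.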

PubMed Disclaimer

Figures

Figure 1
The architecture of the DeepBL methodology. The development of DeepBL involves four major stages: (A) data curation, where all ‘BL’ sequences annotated in the NCBI RefSeq database are extracted and used as the positive samples, whereas 0.015% of all ‘Not BL’ sequences are randomly chosen as the negative samples to constitute the ‘Not-BL’ subset; (B) feature encoding, where the sequence encoding scheme CKSAAP is applied to encode the sequence of proteins in the benchmark dataset; (C) model training, where the model architecture is built, model hyperparameters are optimized and training strategies are compared; and (D) performance evaluation, where the performance of DeepBL models is assessed by performing 10-fold cross-validation and independent tests. (E) shows the detailed architecture of the Small VGGNet deep learning framework.
Figure 2
Number of sequences remaining in the BL sequence datasets after clustering by CD-HIT at different sequence identity cutoff thresholds, ranging from 0.7 to 1.0.
Figure 3
ROC curves and PR curves on the 10-fold cross-validation test. The AUC values of the ROC and PR curves are reported as average and SD values.
Figure 4
Boxplots of performance results on the 10-fold cross-validation tests. Performance metrics monitored during model training included (a) loss value and (b) categorical accuracy, F1-Score and MCC.
Figure 5
ROC and PR curves for the prediction of Classes A, B, C, D and Not-BLs on the independent test datasets.
Figure 6
Confusion matrix of predicted results on the independent test. The matrix represents the distribution of the outputs for each of the five classes (Classes A, B, C, D and Not-BL). Correctly predicted numbers are highlighted and shown on the diagonal line.
Figure 7
Relationship between the resulting AUC values and varying sequence identity thresholds on the benchmark datasets.
Figure 8
Boxplots of F1-Score, MCC and ACC on the datasets with varying sequence identity thresholds.
Figure 9
Statistics of proteome-wide prediction of BLs and their respective classes by applying the optimized DeepBL model.
Figure 10
Prediction outputs for the two novel BLs, BKC-1 and PAD-1, in the case study.
Figure 11
Screenshot of the web pages of the DeepBL webserver. (A) The input page of DeepBL; (B) the submission page of DeepBL and (C) the prediction output interface of DeepBL.
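
The figure captions above refer to a multi-class confusion matrix (Figure 6) and to accuracy, F1-Score and MCC (Figures 4 and 8). As a hedged illustration of how these quantities relate, the sketch below derives them from a confusion matrix whose rows are true classes and whose columns are predicted classes. The function name `metrics_from_confusion` is hypothetical, the multi-class MCC follows the standard generalized formula, and whether the paper averages F1 across classes (and how) is not stated in the source, so per-class values are returned.

```python
from math import sqrt


def metrics_from_confusion(cm):
    """Return (accuracy, per-class F1 list, multi-class MCC).

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    true_counts = [sum(cm[i]) for i in range(n)]                       # row sums
    pred_counts = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # column sums

    accuracy = correct / total if total else 0.0

    # F1 for class k equals 2*TP / (2*TP + FP + FN) = 2*TP / (true_k + pred_k).
    f1 = []
    for k in range(n):
        denom = true_counts[k] + pred_counts[k]
        f1.append(2 * cm[k][k] / denom if denom else 0.0)

    # Generalized (multi-class) Matthews correlation coefficient.
    cov = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    norm = sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    mcc = cov / norm if norm else 0.0
    return accuracy, f1, mcc


# Toy 2-class matrix with invented counts (not results from the paper).
acc, f1, mcc = metrics_from_confusion([[2, 0], [0, 2]])
```

The same function applies unchanged to a 5 x 5 matrix covering Classes A, B, C, D and Not-BL; the toy input here is only for checking the arithmetic.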
