Biomolecules. 2021 Mar 22;11(3):471.
doi: 10.3390/biom11030471.

Deep Learning for Novel Antimicrobial Peptide Design

Christina Wang et al. Biomolecules.

Abstract

Antimicrobial resistance is a growing problem in healthcare, as the overuse of antibacterial agents has risen during the COVID-19 pandemic. The need for new antibiotics is high while the arsenal of available agents is shrinking, especially for the treatment of infections by Gram-negative bacteria such as Escherichia coli. Antimicrobial peptides (AMPs) offer a promising route for novel antibiotic development, and deep learning techniques can be utilised for successful AMP design. In this study, a long short-term memory (LSTM) generative model and a bidirectional LSTM classification model were constructed to design short novel AMP sequences with potential antibacterial activity against E. coli. Two versions of the generative model and six versions of the classification model were trained and optimised using Bayesian hyperparameter optimisation. These models were used to generate sets of short novel sequences, which were then classified as antimicrobial or non-antimicrobial. The validation accuracies of the classification models were 81.6–88.9%, and the novel AMPs were classified as antimicrobial with accuracies of 70.6–91.7%. Predicted three-dimensional conformations of selected short AMPs exhibited an alpha-helical structure with amphipathic surfaces. This demonstrates that LSTMs are effective tools for generating novel AMPs against targeted bacteria and could be utilised in the search for new antibiotic leads.

Keywords: Escherichia coli; antimicrobial peptides; deep learning; long short-term memory; machine learning; peptide design.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Workflow for the machine learning process. (1) Data for the positive set were collected from antimicrobial peptide (AMP) databases Collection of Anti-Microbial Peptides (CAMP), Database of Antimicrobial Activity and Structure of Peptides (DBAASP), Data Repository of Antimicrobial Peptides (DRAMP), and Yet Another Database of Antimicrobial Peptides (YADAMP). Two negative data sets were collected, from AMP databases and UniProt, respectively. (2) Two versions of the generative model were trained, tuned, and tested on the positive data set. (3) Six versions of the classification model were trained, tuned, and tested on the positive data set, as well as either the UniProt or AMP negative data set, using different minimal inhibitory concentration (MIC) cut-offs. (4) The optimised versions of the generative model ultimately produced two sets of AMP sequences. These sequences were then classified by the six optimised versions of the classification model, resulting in 12 sets of predictions.
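The 12 prediction sets in step (4) come from pairing each of the two generative model versions with each of the six classification model versions. The sketch below enumerates those pairs. The mapping of classifier version numbers to negative set and MIC cut-off is an assumption inferred from the panel order in the Figure 3 caption, and all variable names are hypothetical.

```python
from itertools import product

# Two optimised versions of the generative model (Figure 1, step 2).
GENERATIVE_VERSIONS = [1, 2]

# Assumed mapping: classifier version -> (negative data set, MIC cut-off in µM),
# inferred from the panel order described in the Figure 3 caption.
CLASSIFIER_VERSIONS = {
    1: ("AMP databases", 100),
    2: ("AMP databases", 50),
    3: ("AMP databases", 10),
    4: ("UniProt", 100),
    5: ("UniProt", 50),
    6: ("UniProt", 10),
}

# Every generated sequence set is scored by every classifier version.
prediction_sets = [
    {
        "generator": gen,
        "classifier": clf,
        "negatives": negatives,
        "mic_cutoff_um": mic,
    }
    for gen, (clf, (negatives, mic)) in product(
        GENERATIVE_VERSIONS, CLASSIFIER_VERSIONS.items()
    )
]

print(len(prediction_sets))
```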
Figure 2
Data processing (a) and model structures (b). (a) Sequences had a maximum length of 20 residues and were zero-padded if their length was under 20 residues. Subsequently, the sequences were one-hot encoded, after which they were ready to be input into the deep learning models. (b) The generative long short-term memory (LSTM) model started with an LSTM input layer. A dropout layer was added to reduce overfitting and a dense layer was the last layer, responsible for outputting probabilities. The bidirectional LSTM classification model started with a bidirectional LSTM input layer, after which a dropout layer was applied. A flatten layer was added to reduce the dimensions of the input data, and a dense layer was responsible for the output of predictions.
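The data processing in panel (a) can be illustrated with a minimal pure-Python sketch. It assumes an alphabet of the 20 standard amino acids and represents padded positions as all-zero rows; the caption does not state the exact alphabet or padding scheme, and the function name `encode_peptide` is hypothetical.

```python
# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
MAX_LEN = 20  # maximum sequence length given in the caption


def encode_peptide(seq, max_len=MAX_LEN):
    """Return a max_len x 20 one-hot matrix as nested lists.

    Positions beyond the end of the sequence stay all-zero (zero-padding).
    """
    if len(seq) > max_len:
        raise ValueError(f"sequence longer than {max_len} residues: {seq}")
    encoded = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(seq):
        # A KeyError here means the residue is outside the assumed alphabet.
        encoded[pos][AA_INDEX[residue]] = 1
    return encoded


# IWRVWRRW is one of the sequences shown in Figure 5.
matrix = encode_peptide("IWRVWRRW")
print(len(matrix), len(matrix[0]))
```

Each row holds at most a single 1, so the row sums distinguish real residues from padding.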
Figure 3
Receiver operating characteristic (ROC) curves and areas under the curve (AUCs) of each version of the classification model. The best threshold is represented by a red dot. The left three panels (a,c,e) are from the models trained with negatives from antimicrobial peptide (AMP) databases (model Versions 1–3), the right three (b,d,f) from the models trained with negatives from UniProt (model Versions 4–6). From top to bottom, the models were trained with a minimal inhibitory concentration (MIC) cut-off of 100 (a,b), 50 (c,d), and 10 µM (e,f). The green diagonal line represents the no-discrimination line, where the false positive rate equals the true positive rate and the model is of no use.
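The caption does not say how the best threshold was chosen. One common criterion is Youden's J statistic (true positive rate minus false positive rate), and the sketch below uses it purely as an assumption. The labels and scores are hypothetical toy values, not data from the study.

```python
def roc_points(labels, scores):
    """Return (threshold, fpr, tpr) tuples, highest threshold first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(labels, scores):
    """Pick the ROC point maximising Youden's J = TPR - FPR (assumed criterion)."""
    return max(roc_points(labels, scores), key=lambda p: p[2] - p[1])


# Hypothetical toy data: 1 = antimicrobial, 0 = non-antimicrobial.
labels = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

threshold, fpr, tpr = best_threshold(labels, scores)
print(threshold, fpr, tpr)
```

A point on the no-discrimination line has J = 0, so maximising J selects the point farthest above that diagonal.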
Figure 4
Confusion matrices of each version of the classification model. The left three panels (a,c,e) are from the models trained with negatives from antimicrobial peptide (AMP) databases (model Versions 1–3), the right three (b,d,f) from the models trained with negatives from UniProt (model Versions 4–6). From top to bottom, the models were trained with a minimal inhibitory concentration (MIC) cut-off of 100 (a,b), 50 (c,d), and 10 µM (e,f).
Figure 5
Secondary structures and surface representations of sequences IWRVWRRW (a,b), KRWWIRWR (c,d), APLKQLKW (e,f), PFKKSIHL (g,h), and APWKQLKW (i,j). The residues are labelled according to their one-letter abbreviation. Surface representations show the hydrophilicity (blue) and hydrophobicity (orange) of the peptides. White surfaces represent a hydrophobicity of around 0.0.
