Sci Rep. 2018 Jan 26;8(1):1697.
doi: 10.1038/s41598-018-19752-w.

AmPEP: Sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest

Pratiti Bhadra et al. Sci Rep. 2018.

Abstract

Antimicrobial peptides (AMPs) are promising candidates in the fight against multidrug-resistant pathogens owing to their broad range of activities and low toxicity. Nonetheless, identification of AMPs through wet-lab experiments is still expensive and time-consuming. Here, we propose an accurate computational method for AMP prediction using the random forest algorithm. The prediction model is based on the distribution patterns of amino acid properties along the sequence. Using our collection of large and diverse sets of AMP and non-AMP data (3268 and 166791 sequences, respectively), we evaluated 19 random forest classifiers with different positive:negative data ratios by 10-fold cross-validation. Our optimal model, AmPEP with the 1:3 data ratio, showed high accuracy (96%), a Matthews correlation coefficient (MCC) of 0.9, an area under the receiver operating characteristic curve (AUC-ROC) of 0.99, and a Kappa statistic of 0.9. Descriptor analysis of AMP/non-AMP distributions by means of Pearson correlation coefficients revealed that reduced feature sets (from a full-featured set of 105 to a minimal-feature set of 23) can result in comparable performance in all respects except for some reductions in precision. Furthermore, AmPEP outperformed existing methods in terms of accuracy, MCC, and AUC-ROC when tested on benchmark datasets.
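As a rough illustration of how the reported metrics relate to one another, the sketch below computes accuracy, MCC, and Cohen's Kappa from a binary confusion matrix using their standard textbook definitions. The function name `confusion_metrics` and the example counts are hypothetical and chosen only to mimic a 1:3 positive:negative ratio; they are not results from the paper.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return accuracy, MCC and Cohen's kappa for a binary confusion matrix.

    Illustrative helper (not from the paper); counts are plain integers.
    """
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / n

    # Matthews correlation coefficient: defined as 0 when any marginal is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "kappa": kappa}


# Hypothetical counts: 100 AMPs and 300 non-AMPs (a 1:3 ratio).
example = confusion_metrics(tp=90, fp=10, tn=290, fn=10)
```

Both MCC and Kappa correct for class imbalance in a way raw accuracy does not, which is presumably why they are reported alongside accuracy when the negative set is several times larger than the positive set.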

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Pearson correlation coefficients (PCCs) between AMP and non-AMP distributions of the same descriptor in the Mmodel_train dataset.
Figure 2
Performance of RF classifiers during 10-fold cross-validation on datasets with different AMP/non-AMP ratios.
Figure 3
Illustration of the calculations of DF with a sample antibacterial peptide.
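The abstract describes features built from the distribution of amino acid properties along the sequence. A minimal sketch of one common way to compute such distribution features is given below, assuming the conventional composition-transition-distribution (CTD) scheme: residues are grouped into three classes per property, and for each class the sequence positions (as a percentage of length) of the first, 25%, 50%, 75%, and 100% occurrences are recorded. The hydrophobicity grouping, the function name `distribution_descriptor`, the rounding convention (floor, with a minimum of the first occurrence), and the toy peptide are all assumptions for illustration and may differ from the authors' implementation.

```python
import math

# Assumed three-class hydrophobicity grouping commonly used with CTD descriptors.
HYDROPHOBICITY_CLASSES = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}

# Fractions of a class's occurrences at which positions are sampled.
CHECKPOINTS = (("first", 0.0), ("p25", 0.25), ("p50", 0.50), ("p75", 0.75), ("p100", 1.0))


def distribution_descriptor(sequence, classes=HYDROPHOBICITY_CLASSES):
    """Return {class_checkpoint: position as % of sequence length}."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")

    features = {}
    for class_name, residues in classes.items():
        # 1-based positions of every residue belonging to this class.
        positions = [i + 1 for i, aa in enumerate(seq) if aa in residues]
        for label, fraction in CHECKPOINTS:
            key = f"{class_name}_{label}"
            if not positions:
                features[key] = 0.0  # class absent from the sequence
                continue
            # k-th occurrence; floor convention, never below the first one.
            k = max(1, math.floor(fraction * len(positions)))
            features[key] = 100.0 * positions[k - 1] / length
    return features


# Toy peptide, not taken from the paper.
toy = distribution_descriptor("AKLV")
```

One property grouping yields 3 classes x 5 checkpoints = 15 values; seven such groupings would give 105 values, which is consistent with the size of the full feature set mentioned in the abstract.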
