Cell-penetrating peptides predictors: A comparative analysis of methods and datasets
- PMID: 37672879
- DOI: 10.1002/minf.202300104
Abstract
Cell-penetrating peptides (CPPs) are emerging as an alternative to small-molecule drugs, expanding the range of biomolecules that can be targeted for therapeutic purposes. Because identifying and designing new CPPs is important, a great variety of predictors have been developed for this purpose. To rank these predictors, a couple of recent studies compared their performance on specific datasets, yet their conclusions cannot determine whether the ranking obtained is due to the model, the set of descriptors, or the datasets used to test the predictors. We present a systematic study of how the peptide sequence similarity within the datasets influences the predictors' performance. The analysis reveals that the datasets used for training have a stronger influence on the predictors' performance than the model or descriptors employed. We show that datasets with low sequence similarity between the positive and negative examples can be easily separated, and the tested classifiers performed well on them. By contrast, a dataset with high sequence similarity between CPPs and non-CPPs is a hard dataset, and it is the kind that should be used to assess the performance of new predictors.
Keywords: cell-penetrating peptides; datasets; machine learning; peptide sequence similarity.
© 2023 The Authors. Molecular Informatics published by Wiley-VCH GmbH.
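The distinction the abstract draws between easy and hard datasets can be illustrated with a minimal sketch. It assumes a crude similarity measure (the `ratio` of Python's `difflib.SequenceMatcher`) as a stand-in for whatever alignment-based measure the study actually uses, which the abstract does not specify. The function names and the toy sequences are hypothetical, chosen for illustration, and not taken from the paper's datasets.

```python
from difflib import SequenceMatcher
from statistics import mean


def similarity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; an assumed stand-in for an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def mean_max_cross_similarity(positives, negatives):
    """For each positive, take its closest negative, then average over positives.

    Higher values mean the two classes overlap more in sequence space,
    i.e. the dataset is expected to be harder to separate.
    """
    return mean(max(similarity(p, n) for n in negatives) for p in positives)


# Toy sequences for illustration only (not entries from the paper's datasets).
cpp = ["RRRRRRRR", "KKRRKKRRKK"]
easy_negatives = ["DEDEDEDE", "GSGSGSGSGS"]   # share no residues with the positives
hard_negatives = ["RRRRRRRE", "KKRRKKRRKD"]   # differ from a positive by one residue

easy_score = mean_max_cross_similarity(cpp, easy_negatives)
hard_score = mean_max_cross_similarity(cpp, hard_negatives)

print(f"easy dataset: {easy_score:.3f}")
print(f"hard dataset: {hard_score:.3f}")
```

Under these assumptions, a benchmark whose negatives score close to the positives would count as hard in the abstract's sense, while one whose negatives share almost nothing with the positives would be easy for nearly any classifier.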
