Review
Front Artif Intell. 2025 Jan 7;7:1497307.
doi: 10.3389/frai.2024.1497307. eCollection 2024.

A bird's-eye view of the biological mechanism and machine learning prediction approaches for cell-penetrating peptides

Maduravani Ramasundaram et al. Front Artif Intell.

Abstract

Cell-penetrating peptides (CPPs) are highly effective at crossing eukaryotic membranes while carrying various cargo molecules, such as drugs, proteins, nucleic acids, and nanoparticles, without causing significant harm. Owing to their unique chemical properties, CPP-based drug delivery systems are being developed for cancer, genetic disorders, and diabetes. Wet-lab experiments in drug discovery are time-consuming and expensive. Machine learning (ML) techniques, given accurate and detailed data, can enhance and accelerate the drug discovery process. ML classifiers, such as support vector machine (SVM), random forest (RF), gradient-boosted decision trees (GBDT), and different types of artificial neural networks (ANN), are commonly used for CPP prediction, with performance evaluated by cross-validation. These ML strategies improve functional CPP prediction by drawing on CPP datasets produced by high-throughput sequencing and computational methods. This review focuses on several ML-based CPP prediction tools. We discuss the CPP mechanism to explain how CPPs pass through cells. We compare diverse CPP prediction methods by their algorithms, dataset size, feature encoding, software utilities, assessment metrics, and prediction scores. Prediction performance is evaluated on independent datasets by accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC). In conclusion, this review aims to encourage the use of ML algorithms for finding effective CPPs, which could benefit future research on drug delivery and therapeutics.

Keywords: artificial neural network; cell-penetrating peptides; machine learning; mechanism; random forest; support vector machine.
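
The four evaluation metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the review or from any of the tools it compares.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute ACC, SN, SP and MCC from confusion-matrix counts.

    tp/fn: CPPs predicted as CPP / as non-CPP.
    tn/fp: non-CPPs predicted as non-CPP / as CPP.
    Any ratio with a zero denominator is reported as 0.0 (a convention
    chosen for this sketch, not something prescribed by the review).
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0   # accuracy
    sn = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical counts for a balanced independent test set of 100 peptides.
scores = binary_metrics(tp=45, tn=40, fp=10, fn=5)
print(scores)
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated than accuracy when the positive and negative classes are imbalanced.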

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Illustration of the basic mechanism of cell-penetrating peptides for intracellular invagination into cells. Cargo can be a drug, protein, micromolecule, siRNA, etc. (created with BioRender.com/k22o427).
Figure 2
Depiction of the existing framework of CPP prediction methods. It shows some of the best predictors on training and independent datasets, with the feature encodings (AAC, PCP, DPC, CKSAAGP, QSO, etc.) selected for prediction. Different ML classifiers (SVM, RF, LGBM, GB, SNN, PractiCPP, and CNN-BiLSTM) achieved accuracies of around 93–97%. The CV technique was used for model evaluation. CKSAAGP, composition of k-spaced amino acid group pairs; QSO, quasi sequence order; CV, cross-validation; AAC, amino acid composition; DPC, dipeptide composition; BPP, binary profiles of pattern; PCP, physicochemical properties; SF, sequential features; LSF, local structure features; PTF, pretrained features; GB, gradient boosting; LGBM, light gradient boosting machine; SVM, support vector machine; SNN, siamese neural network; RF, random forest; CNN-BiLSTM, convolutional neural network-bidirectional long short-term memory.
Figure 3
Comparison of the sizes of the training and independent datasets used by state-of-the-art methods for CPP prediction.
Figure 4
Comparison of existing CPP prediction methods on training datasets. ACC, accuracy; SN, sensitivity; SP, specificity; MCC, Matthews correlation coefficient.
Figure 5
Comparison of existing CPP prediction methods on independent datasets. ACC, accuracy; SN, sensitivity; SP, specificity; MCC, Matthews correlation coefficient.
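
Two of the feature encodings listed in the Figure 2 caption, amino acid composition (AAC) and dipeptide composition (DPC), are simple enough to sketch directly. The code below follows the usual definitions (residue or residue-pair frequencies over the 20 standard amino acids); the function names, the handling of non-standard residues, and the input peptide are assumptions made for illustration and do not reproduce the implementation of any specific predictor covered by the review.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: a 20-dimensional frequency vector.

    Frequencies are relative to the full sequence length, so any
    non-standard residue lowers the total below 1.0 (sketch assumption).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: a 400-dimensional frequency vector over
    all overlapping adjacent residue pairs in the sequence."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for pair in pairs:
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


# Illustrative input only: an arginine-rich toy peptide.
peptide = "RRRRRRRR"
features = aac(peptide) + dpc(peptide)
print(len(features))
```

Concatenating the two vectors gives a fixed-length numeric representation that a classifier such as an SVM or RF can consume regardless of peptide length.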

