Comput Struct Biotechnol J. 2025 May 28;27:2347-2358.
doi: 10.1016/j.csbj.2025.05.039. eCollection 2025.

ToxiPep: Peptide toxicity prediction via fusion of context-aware representation and atomic-level graph

Jiahui Guan et al. Comput Struct Biotechnol J.

Abstract

Peptide-based therapeutics have emerged as a promising avenue in drug development, offering high biocompatibility, specificity, and efficacy. However, the potential toxicity of peptides remains a significant challenge, necessitating the development of robust toxicity prediction methods. In this study, we introduce ToxiPep, a novel dual-model framework for peptide toxicity prediction that integrates sequence-based contextual information with atomic-level structural features. This framework combines BiGRU and Transformer to capture local and global sequence dependencies while leveraging multi-scale CNNs to extract refined structural features from molecular graphs derived from peptide SMILES representations. A cross-attention mechanism aligns and fuses these two feature modalities, enabling the model to capture intricate relationships between sequence and structural information. ToxiPep outperforms several state-of-the-art tools, including ToxinPred2, CSM-Toxin, PepNet, and ToxinPred3, on both internal and independent test sets. Additionally, interpretability analyses reveal that ToxiPep identifies key amino acids along with their structural features, providing insights into the molecular mechanisms of peptide toxicity. To facilitate broader accessibility, we have also developed a web server for convenient user access. Overall, this framework has the potential to accelerate the identification of safer therapeutic peptides, offering new opportunities for peptide-based drug development in precision medicine.

Keywords: Deep learning; Drug discovery; Peptide bioactivity prediction; Sequence modeling.

Conflict of interest statement

The authors have declared no conflict of interest.

Figures

Graphical abstract
Fig. 1
Framework of ToxiPep. The model fuses sequence-based contextual representations and atomic-level structural features for enhanced peptide classification. Sequence embeddings are processed by a BiGRU and a Transformer encoder, while SMILES-derived graph features are refined via multi-scale convolutions. Cross-attention aligns both modalities, and an MLP classifier generates the final prediction.
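The cross-attention step named in this caption can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention in which residue-level sequence vectors act as queries and atom-level graph vectors act as keys and values; the record does not specify ToxiPep's actual formulation, and the learned projection matrices, multiple heads, and all names and toy inputs below are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (no learned projections).

    queries: one vector per residue (sequence branch, assumed).
    keys, values: one vector per atom (graph branch, assumed).
    Returns one fused vector per query plus the attention weights.
    """
    dim = len(keys[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this residue to every atom, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted average of the atom value vectors.
        fused.append([sum(w * v[j] for w, v in zip(attn, values))
                      for j in range(len(values[0]))])
    return fused, weights

# Toy inputs (hypothetical): 2 residue vectors attending over 3 atom vectors.
seq_feats = [[1.0, 0.0], [0.0, 1.0]]
atom_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(seq_feats, atom_feats, atom_feats)
```

Each row of `weights` sums to 1 and shows how strongly one residue attends to each atom; in a full model the fused vectors would be pooled and passed to the MLP classifier.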
Fig. 2
Amino acid composition differences between toxic and non-toxic peptides (CD-HIT threshold = 0.9) and their statistical significance. (A) Mean amino acid composition of toxic and non-toxic peptides. (B) Bonferroni-corrected p-values based on Wilcoxon rank-sum tests of amino acid composition differences. (C) Log-transformed corrected p-values representing the level of statistical significance for each amino acid.
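As a rough illustration of the quantities in this caption, the sketch below computes the amino acid composition of one peptide, applies a Bonferroni correction to a set of p-values, and log-transforms the result. The example peptide, the raw p-values, and the use of a negative base-10 logarithm are all assumptions made for illustration; they are not data or settings from the paper. The Wilcoxon rank-sum test itself is not implemented here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(peptide):
    """Fraction of each standard residue in one peptide.

    Non-standard characters still count toward the length.
    """
    counts = Counter(peptide.upper())
    return {aa: counts[aa] / len(peptide) for aa in AMINO_ACIDS}

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return {key: min(1.0, p * n) for key, p in p_values.items()}

# Hypothetical peptide and hypothetical raw p-values (3 tests here;
# a full analysis over all residues would correct for 20 tests).
comp = composition("GIGKFLHSAK")
raw_p = {"K": 0.001, "C": 0.0004, "G": 0.2}
adj_p = bonferroni(raw_p)

# Assumed transform: -log10, so larger values mean stronger significance.
neg_log = {key: -math.log10(p) for key, p in adj_p.items()}
```

In practice, the composition would be computed for every peptide in each class, and the per-residue p-values would come from a rank-sum test comparing the two classes before correction.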
Fig. 3
Performance comparison with machine learning and deep learning baseline models. Panels A–C present radar plots comparing the proposed model ToxiPep with three categories of baseline methods: (A) machine learning models using handcrafted features, (B) deep learning architectures trained on one-hot encoded sequences, and (C) pretrained protein language models.
Fig. 4
Performance comparison in ablation analysis. (A) and (B) Performance of ToxiPep compared with different ablation models, including Transformer, GRU + Transformer, Atomic-level graph with multi-scale CNN, Concatenation fusion, and Late fusion.
Fig. 5
UMAP visualizations of feature representations derived from different model modules under a single random seed. (A) Sequence embedding (silhouette = 0.00). (B) Structural encoding (silhouette = 0.00). (C) Context-aware sequence processing (silhouette = 0.33). (D) Atomic-level graph processing (silhouette = 0.34). (E) Without cross-attention fusion (silhouette = 0.41). (F) ToxiPep (silhouette = 0.44).
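The silhouette values reported in this caption measure how well the toxic and non-toxic classes separate in each representation: values near 0 indicate overlapping classes, and values closer to 1 indicate compact, well-separated ones. The sketch below is a minimal implementation of the standard mean silhouette coefficient with Euclidean distance. Whether the paper computed it on the UMAP coordinates or on the original features is not stated here, and the toy points are hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(point, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

# Toy 2-D embedding (hypothetical): two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
separated = silhouette_score(points, [0, 0, 1, 1])  # labels match the groups
mixed = silhouette_score(points, [0, 1, 0, 1])      # labels cut across groups
```

With labels that match the two groups the score is close to 1, while labels that cut across the groups give a negative score, which mirrors the progression from panels A and B to panel F.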
Fig. 6
Interpretability analysis of the model in predicting toxic peptides. (A) Attention weights of amino acids in a representative toxic peptide sequence. (B–E) Heatmaps generated by CAM analysis for CNN. (F–I) Molecular structure analysis of key amino acids.
Fig. 7
Demonstration of the ToxiPep web interface for peptide toxicity prediction. (1) Starting the Prediction: Users initiate the process by clicking the “Start Prediction” button, which opens a dedicated page for sequence input. (2) Input protein sequence: On the new page, users can enter peptide sequences directly into the text box or upload a file containing multiple sequences using the “Upload” button. This flexible input option supports both single and batch predictions. (3) Running the Prediction: After inputting the sequences, users click the “Start Prediction” button to submit them for analysis. (4) Results Panel: The results page presents the prediction outcomes, including the toxicity classification for each peptide and confidence scores from the model, displayed in an intuitive format.
