ProMethylNet: Protein Methylation Site Prediction Based on Multimodal Feature Fusion and Deep Learning
- PMID: 40811205
- DOI: 10.1109/TCBBIO.2025.3575783
Abstract
Protein methylation is a fundamental post-translational modification (PTM) that plays critical roles in gene regulation, protein interactions, and cellular signaling pathways. However, traditional experimental approaches for identifying methylation sites are time-consuming, labor-intensive, and limited in throughput, so there is an urgent need for computational methods that predict methylation sites efficiently and accurately. In this study, we introduce ProMethylNet, a deep learning framework for predicting protein methylation sites. ProMethylNet integrates multimodal features, including one-hot encoding, amino acid physicochemical properties, ProtBERT embeddings, and position-specific scoring matrices (PSSM), to enrich the sequence representation. The framework combines multi-scale convolutional attention networks (MSCANet), graph attention networks (GAT), and bidirectional long short-term memory networks (BiLSTM) to capture local sequence motifs, structural information, and long-range dependencies within protein sequences. To mitigate class imbalance in the dataset, ProMethylNet is trained with a weighted binary cross-entropy loss function. External validation on an independent subset (20%) of the UniProtKB dataset and on the PLMD 3.0 dataset indicates that ProMethylNet improves on existing methods in F1-score, Matthews correlation coefficient (MCC), and area under the curve (AUC), with gains of approximately 20 percentage points. These results suggest that ProMethylNet is robust and efficient, with potential applications in protein methylation site prediction, functional annotation, and broader biomedical research.
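For illustration, a weighted binary cross-entropy loss of the kind mentioned in the abstract can be sketched as follows. The abstract does not state how the class weight is chosen, so this sketch assumes a single positive-class weight set to the ratio of negative to positive examples; the function names `weighted_bce` and `balanced_pos_weight` are hypothetical and are not taken from the ProMethylNet code.

```python
import math


def balanced_pos_weight(labels):
    """Assumed weighting scheme: ratio of negative to positive examples."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    return n_neg / n_pos


def weighted_bce(labels, probs, pos_weight, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight.

    labels: iterable of 0/1 targets (1 = methylated site)
    probs:  iterable of predicted probabilities in [0, 1]
    """
    total = 0.0
    count = 0
    for y, p in zip(labels, probs):
        # Clip probabilities so that log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count


# Toy example: one methylated site among four candidate residues.
labels = [1, 0, 0, 0]
probs = [0.5, 0.5, 0.5, 0.5]
w = balanced_pos_weight(labels)
loss = weighted_bce(labels, probs, w)
```

With this weighting, a missed methylated site is penalized more heavily than a false alarm on an unmodified residue, which counteracts the tendency of a model trained on imbalanced data to predict the majority (negative) class.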