BMC Bioinformatics. 2025 Jul 11;26(1):176.

doi: 10.1186/s12859-025-06197-y.

RMDNet: RNA-aware dung beetle optimization-based multi-branch integration network for RNA-protein binding sites prediction

Jiangbo Zhang¹, Yunhui Peng², Feifei Cui¹, Zilong Zhang¹, Shankai Yan³, Qingchen Zhang⁴

Affiliations

¹ School of Computer Science and Technology, Hainan University, Haikou, 570100, Hainan, China.
² School of Physics Science and Technology, Central China Normal University, Wuhan, 430000, Hubei, China.
³ School of Computer Science and Technology, Hainan University, Haikou, 570100, Hainan, China. skyan@hainanu.edu.cn.
⁴ School of Computer Science and Technology, Hainan University, Haikou, 570100, Hainan, China. zhangqingchen@hainanu.edu.cn.

PMID: 40646507
PMCID: PMC12247420
DOI: 10.1186/s12859-025-06197-y

Jiangbo Zhang et al. BMC Bioinformatics. 2025.

Abstract

RNA-binding proteins (RBPs) play crucial roles in gene regulation. Their dysregulation has been increasingly linked to neurodegenerative diseases, liver cancer, and lung cancer. Although experimental methods like CLIP-seq accurately identify RNA-protein binding sites, they are time-consuming and costly. To address this, we propose RMDNet, a deep learning framework that integrates CNN, CNN-Transformer, and ResNet branches to capture features at multiple sequence scales. These features are fused with structural representations derived from RNA secondary structure graphs, which are processed using a graph neural network with DiffPool. To optimize feature integration, we incorporate an improved dung beetle optimization (IDBO) algorithm, which adaptively assigns fusion weights during inference. Evaluations on the RBP-24 benchmark show that RMDNet outperforms state-of-the-art models including GraphProt, DeepRKE, and DeepDW across multiple metrics. On the RBP-31 dataset, it demonstrates strong generalization ability, while ablation studies on RBPsuite2.0 validate the contributions of individual modules. We assess biological interpretability by extracting candidate binding motifs from the first-layer CNN kernels. Several motifs closely match experimentally validated RBP motifs, confirming the model's capacity to learn biologically meaningful patterns. A downstream case study on YTHDF1 analyzes interpretable spatial binding patterns using a large-scale prediction dataset and CLIP-seq peak alignment. The results confirm that the model captures localized binding signals and is spatially consistent with experimental annotations. Overall, RMDNet is a robust and interpretable tool for predicting RNA-protein binding sites. It has broad potential in disease mechanism research and therapeutic target discovery. The source code is available at https://github.com/cskyan/RMDNet.

Keywords: Convolutional neural network; Dung beetle optimizer; Feature fusion strategy; Graph neural network; Multi-branch deep learning network; RNA–protein binding sites.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

**Fig. 1**
Overall architecture of the proposed RMDNet framework. a Data preparation: RNA sequences are encoded using one-hot vectors and segmented by a sliding window strategy. b Secondary structure extraction: Each sequence is converted into a structural graph using RNAfold, and topology features are extracted via a GNN with DiffPool. c Feature fusion: Sequence features are learned from three parallel classifiers (CNN, CNN-Transformer, and ResNet), and are concatenated with structure features in the fully connected layer. d Weight assignment: During inference, the IDBO algorithm is applied to optimize fusion weights across branches by searching for the best combination
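
The data preparation step in panel a (one-hot encoding plus a sliding window) can be illustrated with a minimal Python sketch. The base order `ACGU`, the window length, the stride, and the handling of unknown bases are assumptions made for illustration; the record above does not state the values used in RMDNet.

```python
from typing import List

ALPHABET = "ACGU"  # assumed base order; not specified in the source


def one_hot(seq: str) -> List[List[int]]:
    """Encode an RNA sequence as a list of 4-dimensional one-hot rows.

    DNA-style 'T' is treated as 'U'; unknown bases (e.g. 'N') become all zeros.
    """
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def sliding_windows(seq: str, window: int, stride: int) -> List[str]:
    """Split a sequence into fixed-length windows taken every `stride` bases.

    A sequence no longer than `window` is returned as a single unpadded segment.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(seq) <= window:
        return [seq]
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, stride)]


# Toy sequence with hypothetical window and stride values.
segments = sliding_windows("AUGGCUACGU", window=4, stride=3)
encoded = [one_hot(s) for s in segments]
```

Each entry of `encoded` is a window-by-4 matrix that a convolutional branch could consume; overlapping windows (stride smaller than window) let a binding site near a segment boundary appear whole in at least one segment.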

**Algorithm 1**
Inference Phase with IDBO-Based Weighted Fusion
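
The idea behind weighted fusion at inference can be sketched as follows. This is not the IDBO algorithm: a plain random search over normalized weights stands in for the optimizer, and the objective (log loss on a small labeled set) is an assumption, since the record above does not state what IDBO optimizes. All function names and the toy probabilities are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def fuse(probs: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of per-branch probabilities (one inner list per branch)."""
    total = sum(weights)
    n = len(probs[0])
    return [sum(w * p[i] for w, p in zip(weights, probs)) / total for i in range(n)]


def log_loss(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    eps = 1e-12
    return -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in zip(y_true, y_prob)
    ) / len(y_true)


def search_weights(
    probs: Sequence[Sequence[float]],
    y_true: Sequence[int],
    n_iter: int = 500,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Random search over normalized weights; a placeholder for IDBO, not IDBO itself."""
    rng = random.Random(seed)
    k = len(probs)
    best_w = [1.0 / k] * k  # start from uniform weights
    best_loss = log_loss(y_true, fuse(probs, best_w))
    for _ in range(n_iter):
        raw = [rng.random() for _ in range(k)]
        s = sum(raw)
        cand = [r / s for r in raw]
        loss = log_loss(y_true, fuse(probs, cand))
        if loss < best_loss:
            best_w, best_loss = cand, loss
    return best_w, best_loss


# Toy labels and made-up branch outputs, for illustration only.
y = [1, 0, 1, 0]
branch_probs = [
    [0.9, 0.2, 0.8, 0.1],  # hypothetical CNN outputs
    [0.6, 0.5, 0.4, 0.5],  # hypothetical CNN-Transformer outputs
    [0.7, 0.3, 0.6, 0.4],  # hypothetical ResNet outputs
]
weights, loss = search_weights(branch_probs, y)
```

Because the search starts from uniform weights and only accepts improvements, the returned loss is never worse than simple averaging of the branches. A population-based optimizer such as IDBO would replace the sampling loop while keeping the same `fuse` step.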

**Fig. 2**
Comparison of average classification metrics among different models on the RBP-24 dataset. The bar chart includes eight standard metrics: AUC, PR-AUC, Accuracy, Precision, Recall, Specificity, F1-score, and MCC
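
For reference, the six threshold-based metrics named in this caption follow directly from confusion-matrix counts. The sketch below uses the standard definitions with made-up labels. AUC and PR-AUC are omitted because they are computed from ranked scores rather than hard predictions. Returning 0.0 for an undefined ratio is a convention chosen here, not something taken from the paper.

```python
import math
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute standard binary classification metrics from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def safe_div(a: float, b: float) -> float:
        # Undefined ratios are reported as 0.0 (a convention, see lead-in).
        return a / b if b else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Made-up labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
```

On this toy input, accuracy, precision, recall, specificity, and F1 all equal 0.75, while MCC is 0.5, which shows how MCC penalizes errors more visibly than the other scores on balanced data.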

**Fig. 3**
Distribution of AUC and MCC scores across different models on the RBP-24 dataset. Each boxplot summarizes the per-RBP performance for one evaluation metric, reflecting the stability and generalization ability of each model

**Fig. 4**
Comparison of AUC scores among different models on the RBP-31 dataset. The bar chart shows the AUC performance of four models—GraphProt, DeepRKE, DeepDW, and RMDNet—across 31 RNA-binding proteins

**Fig. 5**
Comparison of AUC, F1-score, and MCC scores across different ablation variants of RMDNet on CPSF1 and FBL datasets. Each bar represents the average performance under one evaluation metric, illustrating the contribution of individual components and the robustness of the full model

**Fig. 6**
Motifs learned by the CNN branch for YTHDF1. Sixteen sequence logos represent the binding patterns captured by each convolutional kernel
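
One common way to turn a convolutional kernel into a sequence logo is to collect the subsequences that activate it most strongly and tabulate per-position base frequencies. The sketch below assumes that approach; the record above does not describe the exact extraction procedure used for RMDNet, and the example subsequences are invented for illustration.

```python
from typing import Dict, List

BASES = "ACGU"


def position_frequency_matrix(subseqs: List[str]) -> List[Dict[str, float]]:
    """Per-position base frequencies over equal-length subsequences."""
    if not subseqs:
        raise ValueError("need at least one subsequence")
    length = len(subseqs[0])
    if any(len(s) != length for s in subseqs):
        raise ValueError("all subsequences must have the same length")
    pfm = []
    for i in range(length):
        counts = {b: 0 for b in BASES}
        for s in subseqs:
            if s[i] in counts:  # ignore ambiguous bases such as 'N'
                counts[s[i]] += 1
        total = sum(counts.values())
        pfm.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return pfm


def consensus(pfm: List[Dict[str, float]]) -> str:
    """Most frequent base at each position (ties resolved in BASES order)."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in pfm)


# Invented top-activating subsequences for one hypothetical kernel.
hits = ["GGACU", "GGACA", "AGACU", "GGACU"]
pfm = position_frequency_matrix(hits)
motif = consensus(pfm)
```

A matrix like `pfm` is the usual input for logo rendering and for comparison against motif databases such as the CISBP-RNA entries mentioned in Fig. 7.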

**Fig. 7**
Reliable motifs with significant matches to known RBP patterns. Nine sequence logos correspond to CNN kernels whose motifs aligned with entries in the CISBP-RNA database (E-value < 0.05)

**Fig. 8**
Visualization of binding site distribution and prediction score heatmap for YTHDF1 on real RNA sequences. The histogram shows the frequency of high-scoring binding positions (score ) across normalized RNA regions, while the heatmap aggregates 10,000 predicted sequences into 500 groups, highlighting consistent positional patterns. These visualizations reflect the model’s spatial sensitivity and downstream interpretability
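
The aggregation of many per-sequence score profiles into a smaller number of heatmap rows can be sketched as block averaging. This is an assumption: the caption does not say how sequences are ordered or combined, so the sketch averages consecutive rows, which would give 20 sequences per group for 10,000 sequences and 500 groups.

```python
from typing import List


def group_mean(rows: List[List[float]], n_groups: int) -> List[List[float]]:
    """Average consecutive rows into `n_groups` blocks of near-equal size."""
    if not 0 < n_groups <= len(rows):
        raise ValueError("n_groups must be between 1 and the number of rows")
    n = len(rows)
    width = len(rows[0])
    out = []
    for g in range(n_groups):
        start = g * n // n_groups
        stop = (g + 1) * n // n_groups
        block = rows[start:stop]
        out.append([sum(r[j] for r in block) / len(block) for j in range(width)])
    return out


# Toy score matrix: 4 sequences, 2 normalized positions each.
scores = [[0.0, 1.0], [1.0, 1.0], [0.5, 0.0], [0.5, 1.0]]
heat = group_mean(scores, n_groups=2)
```

Averaging within groups smooths per-sequence noise, so positional patterns shared across many sequences stay visible in the heatmap.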
