BMC Bioinformatics. 2025 Jul 11;26(1):176.
doi: 10.1186/s12859-025-06197-y.

RMDNet: RNA-aware dung beetle optimization-based multi-branch integration network for RNA-protein binding sites prediction

Jiangbo Zhang et al. BMC Bioinformatics. 2025.

Abstract

RNA-binding proteins (RBPs) play crucial roles in gene regulation. Their dysregulation has been increasingly linked to neurodegenerative diseases, liver cancer, and lung cancer. Although experimental methods like CLIP-seq accurately identify RNA-protein binding sites, they are time-consuming and costly. To address this, we propose RMDNet, a deep learning framework that integrates CNN, CNN-Transformer, and ResNet branches to capture features at multiple sequence scales. These features are fused with structural representations derived from RNA secondary structure graphs, which are processed using a graph neural network with DiffPool. To optimize feature integration, we incorporate an improved dung beetle optimization (IDBO) algorithm, which adaptively assigns fusion weights during inference. Evaluations on the RBP-24 benchmark show that RMDNet outperforms state-of-the-art models including GraphProt, DeepRKE, and DeepDW across multiple metrics. On the RBP-31 dataset, it demonstrates strong generalization ability, while ablation studies on RBPsuite2.0 validate the contributions of individual modules. We assess biological interpretability by extracting candidate binding motifs from the first-layer CNN kernels. Several motifs closely match experimentally validated RBP motifs, confirming the model's capacity to learn biologically meaningful patterns. A downstream case study on YTHDF1 analyzes interpretable spatial binding patterns using a large-scale prediction dataset and CLIP-seq peak alignment. The results confirm that the model captures localized binding signals and is spatially consistent with experimental annotations. Overall, RMDNet is a robust and interpretable tool for predicting RNA-protein binding sites, with broad potential in disease mechanism research and therapeutic target discovery. The source code is available at https://github.com/cskyan/RMDNet.

Keywords: Convolutional neural network; Dung beetle optimizer; Feature fusion strategy; Graph neural network; Multi-branch deep learning network; RNA–protein binding sites.
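
The weighted fusion of branch outputs described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the IDBO optimizer is replaced here by a plain random search, accuracy on a small labeled set is assumed as the objective, and the names fuse, accuracy, and search_weights are hypothetical.

```python
import random


def fuse(branch_probs, weights):
    """Weighted average of per-branch binding probabilities, one value per sample."""
    total = sum(weights)
    n_samples = len(branch_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, branch_probs)) / total
        for i in range(n_samples)
    ]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def search_weights(branch_probs, labels, n_trials=200, seed=0):
    """Random search over normalized weights (a stand-in for IDBO, not IDBO itself)."""
    rng = random.Random(seed)
    n_branches = len(branch_probs)
    # Start from uniform weights so the result is never worse than simple averaging.
    best_w = [1.0 / n_branches] * n_branches
    best_score = accuracy(fuse(branch_probs, best_w), labels)
    for _ in range(n_trials):
        raw = [rng.random() + 1e-9 for _ in range(n_branches)]
        s = sum(raw)
        w = [r / s for r in raw]
        score = accuracy(fuse(branch_probs, w), labels)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Toy probabilities from three hypothetical branches (CNN, CNN-Transformer, ResNet).
branch_probs = [[0.9, 0.2, 0.6], [0.8, 0.4, 0.3], [0.7, 0.1, 0.4]]
labels = [1, 0, 1]
weights, score = search_weights(branch_probs, labels)
```

The search keeps the uniform weighting as its starting point, so any weights it returns score at least as well as an unweighted average on the data it was given.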

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Overall architecture of the proposed RMDNet framework. a Data preparation: RNA sequences are encoded using one-hot vectors and segmented by a sliding window strategy. b Secondary structure extraction: Each sequence is converted into a structural graph using RNAfold, and topology features are extracted via a GNN with DiffPool. c Feature fusion: Sequence features are learned from three parallel classifiers (CNN, CNN-Transformer, and ResNet), and are concatenated with structure features in the fully connected layer. d Weight assignment: During inference, the IDBO algorithm is applied to optimize fusion weights across branches by searching for the best combination
Algorithm 1
Inference Phase with IDBO-Based Weighted Fusion
Fig. 2
Comparison of average classification metrics among different models on the RBP-24 dataset. The bar chart includes eight standard metrics: AUC, PR-AUC, Accuracy, Precision, Recall, Specificity, F1-score, and MCC
Fig. 3
Distribution of AUC and MCC scores across different models on the RBP-24 dataset. Each boxplot summarizes the per-RBP performance for one evaluation metric, reflecting the stability and generalization ability of each model
Fig. 4
Comparison of AUC scores among different models on the RBP-31 dataset. The bar chart shows the AUC performance of four models—GraphProt, DeepRKE, DeepDW, and RMDNet—across 31 RNA-binding proteins
Fig. 5
Comparison of AUC, F1-score, and MCC scores across different ablation variants of RMDNet on CPSF1 and FBL datasets. Each bar represents the average performance under one evaluation metric, illustrating the contribution of individual components and the robustness of the full model
Fig. 6
Motifs learned by the CNN branch for YTHDF1. Sixteen sequence logos represent the binding patterns captured by each convolutional kernel
Fig. 7
Reliable motifs with significant matches to known RBP patterns. Nine sequence logos correspond to CNN kernels whose motifs aligned with entries in the CISBP-RNA database (E-value < 0.05)
Fig. 8
Visualization of binding site distribution and prediction score heatmap for YTHDF1 on real RNA sequences. The histogram shows the frequency of high-scoring binding positions (score threshold given as a formula image in the original) across normalized RNA regions, while the heatmap aggregates 10,000 predicted sequences into 500 groups, highlighting consistent positional patterns. These visualizations reflect the model’s spatial sensitivity and downstream interpretability
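
The data preparation step named in the Fig. 1a caption (one-hot encoding and sliding-window segmentation) can be sketched as follows. The window length, stride, padding with N, and the all-zero encoding for unknown bases are assumptions made for illustration; the page does not state the values or conventions used by RMDNet, and the function names are hypothetical.

```python
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA string as 4-dimensional vectors; unknown bases (e.g. N) become all zeros."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in seq]


def sliding_windows(seq, window, stride):
    """Split a sequence into fixed-length windows; a short final window is padded with N."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    while True:
        chunk = seq[start:start + window]
        windows.append(chunk.ljust(window, "N"))
        if start + window >= len(seq):
            break  # this window already reached the end of the sequence
        start += stride
    return windows


# Hypothetical settings: window of 4 nucleotides, stride of 3.
segments = sliding_windows("ACGUACGUAC", window=4, stride=3)
encoded = [one_hot(s) for s in segments]
```

Each segment becomes a window-by-4 matrix, which is the input shape a 1D convolutional branch would typically expect.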
