Sci Rep. 2024 Nov 11;14(1):27503.
doi: 10.1038/s41598-024-78212-w.

Prediction of miRNA-disease association based on multisource inductive matrix completion

YaWei Wang et al. Sci Rep.

Abstract

MicroRNAs (miRNAs) are endogenous non-coding RNAs, approximately 23 nucleotides in length, that play significant roles in various cellular processes. Numerous studies have shown that miRNAs are involved in the regulation of many human diseases. Accurate prediction of miRNA-disease associations is crucial for the early diagnosis, treatment, and prognosis assessment of diseases. In this paper, we propose the Autoencoder Inductive Matrix Completion (AEIMC) model to identify potential miRNA-disease associations. The model captures interaction features from multiple similarity networks, including miRNA functional similarity, miRNA sequence similarity, disease semantic similarity, disease ontology similarity, and Gaussian interaction kernel similarity between miRNAs and diseases. Autoencoders extract more complex and abstract representations of these data, which are then fed into the inductive matrix completion model for association prediction. The effectiveness of the model is validated through cross-validation, stratified threshold evaluation, and case studies, and ablation experiments further confirm the value of the sequence and ontology similarities, which this work introduces for the first time.
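
One of the similarity sources named in the abstract is the Gaussian interaction kernel similarity. The following is a minimal sketch of how such a kernel is commonly computed from a binary miRNA-disease association matrix. The function name, the `bandwidth` parameter, the bandwidth normalization, and the toy matrix are assumptions made for illustration; the abstract does not specify the authors' exact settings.

```python
import numpy as np

def gip_kernel_similarity(profiles, bandwidth=1.0):
    """Gaussian kernel similarity between the rows of a binary association matrix.

    Each row is the interaction profile of one entity (for example, one miRNA
    across all diseases). The bandwidth normalization by the mean squared
    profile norm is a common convention and is assumed here.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = bandwidth / mean_sq_norm
    # Squared Euclidean distance between every pair of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dist)

# Hypothetical toy data: 3 miRNAs (rows) x 3 diseases (columns).
association = np.array([[1, 0, 1],
                        [1, 0, 1],
                        [0, 1, 0]])
mirna_similarity = gip_kernel_similarity(association)      # miRNA x miRNA
disease_similarity = gip_kernel_similarity(association.T)  # disease x disease
```

Transposing the association matrix gives the disease-side kernel, so the same function serves both entity types.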

Keywords: Ablation experiment; Autoencoder; Inductive matrix completion; Multi-source information; Optimization algorithm; miRNA-disease association.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Model framework diagram. First, three individual similarity matrices each for miRNAs and diseases were prepared from the respective databases. Next, a comprehensive similarity matrix was generated for diseases and for miRNAs. A low-dimensional representation of each comprehensive similarity matrix was then learned through an autoencoder. Finally, the low-dimensional representations were input into an Inductive Matrix Completion (IMC) model to derive the final prediction matrix.
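
For orientation, the final IMC step can be written in the standard inductive matrix completion form. This is a sketch under assumed notation, not the paper's own formulation, which may differ (for example in how observed entries are weighted or how the factors are regularized). Here $A \in \mathbb{R}^{m \times n}$ is the known miRNA-disease association matrix, $X \in \mathbb{R}^{m \times f_m}$ and $Y \in \mathbb{R}^{n \times f_d}$ are the low-dimensional miRNA and disease representations produced by the autoencoders, $W$ and $H$ are the learned factors of rank $r$, $\Omega$ is the set of observed entries, and $\lambda$ is a regularization weight; all of these symbols are assumptions.

```latex
\min_{W \in \mathbb{R}^{f_m \times r},\; H \in \mathbb{R}^{f_d \times r}}
  \; \frac{1}{2} \left\lVert P_{\Omega}\!\left( A - X W H^{\top} Y^{\top} \right) \right\rVert_F^2
  + \frac{\lambda}{2} \left( \lVert W \rVert_F^2 + \lVert H \rVert_F^2 \right),
\qquad
\hat{A} = X W H^{\top} Y^{\top}
```

Entry $\hat{A}_{ij}$ is then read as the predicted association score between miRNA $i$ and disease $j$.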
Fig. 2
ROC curves of AEIMC. The ROC curves and AUC values for each fold of the AEIMC model; every fold achieves an AUC above 0.91, with an average AUC of 0.92.
Fig. 3
AUC differences between AEIMC and five comparison models (IMCMDA, SIMCCDA, NIMCGCN, PDMDA, and LAGCN), estimated with the bootstrap method. The distributions show that all AUC differences are significantly positive: none of the confidence intervals cross zero, indicating that AEIMC performs significantly better than all comparison models at the 0.05 significance level.
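
A bootstrap comparison of this kind can be sketched as follows. This is a minimal illustration assuming a percentile confidence interval over paired resamples; the function names, the number of resamples, and the synthetic scores are hypothetical, and the authors' exact resampling scheme is not given in this excerpt.

```python
import numpy as np

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(wins + 0.5 * ties)

def bootstrap_auc_difference(labels, scores_a, scores_b,
                             n_resamples=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for AUC(a) - AUC(b) on paired resamples."""
    labels = np.asarray(labels)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    if labels.min() == labels.max():
        raise ValueError("both classes must be present")
    rng = np.random.default_rng(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_resamples:
        idx = rng.integers(0, n, size=n)  # sample pairs with replacement
        if labels[idx].min() == labels[idx].max():
            continue  # skip resamples that contain a single class
        diffs.append(auc(labels[idx], scores_a[idx]) - auc(labels[idx], scores_b[idx]))
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Synthetic example: one informative scorer and one uninformative scorer.
demo_rng = np.random.default_rng(1)
demo_labels = np.tile([0, 1], 100)
strong_scores = demo_labels + demo_rng.normal(0.0, 0.5, size=200)
weak_scores = demo_rng.normal(0.0, 1.0, size=200)
ci_lower, ci_upper = bootstrap_auc_difference(demo_labels, strong_scores, weak_scores)
```

If the interval `(ci_lower, ci_upper)` lies entirely above zero, the first scorer is judged significantly better at the chosen `alpha`, which is the reading the caption applies to each comparison model.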
Fig. 4
ROC curves when fusing only two similarity dimensions. The ROC curves and corresponding AUC values for each fold show that, without miRNA sequence similarity and disease ontology similarity, the accuracy and effectiveness of the model decrease noticeably. This validates the rationale for considering additional similarity dimensions.
Fig. 5
Heatmaps of Model Performance under different numbers of known associations and Top-k thresholds. (a) Precision for different percentages of known associations, (b) Recall for different percentages of known associations, and (c) F1 for different percentages of known associations. The color variations in the heatmaps show that as the number of known associations increases, the Precision, Recall, and F1 scores gradually improve. Additionally, lowering the Top-k threshold further enhances model performance.
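
The Top-k metrics behind such heatmaps can be sketched as follows, assuming that the k highest-scoring candidate pairs are treated as predicted positives. The function name and the toy scores are hypothetical; the excerpt does not state how the authors handle ties or thresholds.

```python
import numpy as np

def top_k_metrics(scores, labels, k):
    """Precision, recall and F1 when the k highest-scoring pairs are called positive."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    top = np.argsort(-scores, kind="stable")[:k]  # indices of the k best scores
    true_positives = int(labels[top].sum())
    total_positives = int(labels.sum())
    precision = true_positives / k
    recall = true_positives / total_positives if total_positives else 0.0
    if true_positives == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: five candidate pairs, three of which are true associations.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
example_labels = [1, 0, 1, 1, 0]
precision_at_2, recall_at_2, f1_at_2 = top_k_metrics(example_scores, example_labels, k=2)
```

Evaluating this function over a grid of k values and of fractions of known associations would fill one cell of each heatmap per combination.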
Fig. 6
Validation of prediction results for three types of cancer. This figure shows the validation of the top 50 miRNA prediction results for breast cancer, lung cancer (non-small cell lung carcinoma), and gastric cancer. The blue sections represent correct predictions validated by the HMDD v4.0 database, while the red sections represent incorrect predictions that were not validated.
Fig. 7
The DAG of bone neoplasms. The DAG illustrates two distinct paths in Bone Neoplasms, one originating from C04 and the other from C05. The first path involves successive nodes C04.588 and C04.588.149, while the second path follows nodes C05, C05.116, and C05.116.231.
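
The two paths in this DAG can be recovered directly from the tree numbers named in the caption. The sketch below also assigns each ancestor a semantic contribution that decays with its distance from the disease, a scheme commonly used for disease semantic similarity. The decay factor of 0.5 and the function name are assumptions; the excerpt does not state the authors' exact definition.

```python
def semantic_contributions(disease, tree_numbers, decay=0.5):
    """Contribution of a disease and its DAG ancestors, derived from tree numbers.

    The disease itself contributes 1.0. An ancestor reached after `steps`
    upward moves contributes decay ** steps; if several paths reach the same
    ancestor, the largest contribution is kept.
    """
    contributions = {disease: 1.0}
    for tree_number in tree_numbers:
        parts = tree_number.split(".")
        for steps in range(1, len(parts)):
            ancestor = ".".join(parts[:len(parts) - steps])
            value = decay ** steps
            contributions[ancestor] = max(contributions.get(ancestor, 0.0), value)
    return contributions

# Tree numbers for Bone Neoplasms as given in the caption.
bone_neoplasms = semantic_contributions("Bone Neoplasms",
                                        ["C04.588.149", "C05.116.231"])
semantic_value = sum(bone_neoplasms.values())
```

Under these assumptions the direct parents C04.588 and C05.116 each contribute 0.5, and the roots C04 and C05 each contribute 0.25.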
Fig. 8
Schematic diagram of the autoencoder principle. The diagram illustrates the encoding and decoding stages, which compress and reconstruct information through neurons and their interconnections.
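
The compress-and-reconstruct principle can be illustrated with a small autoencoder written in plain NumPy. This is a minimal sketch assuming a single sigmoid hidden layer, a linear decoder, a mean squared reconstruction loss, and full-batch gradient descent; the layer sizes, learning rate, and random similarity matrix are hypothetical and do not reflect the authors' architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, hidden_dim, epochs=300, lr=0.5, seed=0):
    """Train a one-hidden-layer autoencoder; return the encoder and the loss history."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, size=(n_features, hidden_dim))  # encoder weights
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, size=(hidden_dim, n_features))  # decoder weights
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # encoding: compress to hidden_dim
        X_hat = H @ W2 + b2        # decoding: reconstruct the input
        err = X_hat - X
        losses.append(float((err ** 2).mean()))
        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * err / err.size
        g_hidden = (g_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hidden)
        b1 -= lr * g_hidden.sum(axis=0)

    def encode(Z):
        return sigmoid(Z @ W1 + b1)

    return encode, losses

# Hypothetical 20 x 20 symmetric similarity matrix with unit diagonal.
data_rng = np.random.default_rng(0)
similarity = data_rng.random((20, 20))
similarity = (similarity + similarity.T) / 2.0
np.fill_diagonal(similarity, 1.0)

encode, losses = train_autoencoder(similarity, hidden_dim=5)
embedding = encode(similarity)  # low-dimensional representation, one row per entity
```

The rows of `embedding` play the role of the low-dimensional features that a downstream matrix completion model would consume.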
Fig. 9
The pseudocode corresponding to the model training. It demonstrates the key steps in constructing the model and predicting miRNA-disease associations.
