On protocols and measures for the validation of supervised methods for the inference of biological networks
- PMID: 24348517
- PMCID: PMC3848415
- DOI: 10.3389/fgene.2013.00262
Abstract
Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because the experiments needed to elucidate these networks are laborious, costly, or simply unavailable, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities, together with additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines on how best to exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.
Keywords: ROC curves; biological network inference; cross-validation; evaluation protocols; precision-recall curves; supervised learning.
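The abstract's remark that cross-validation must be adapted to classification problems on pairs of objects can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`node_folds`, `split_pairs`), the group labels, and the toy node names are all hypothetical. The sketch assumes one possible adaptation, in which nodes rather than pairs are held out, so that each candidate pair can be grouped by how many of its endpoints were unseen during training.

```python
import random
from itertools import combinations


def node_folds(nodes, n_folds, seed=0):
    """Shuffle the nodes and deal them into n_folds disjoint groups."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def split_pairs(nodes, test_nodes):
    """Group all unordered node pairs by the number of held-out endpoints.

    "train"    : both endpoints are training nodes
    "one_new"  : exactly one endpoint is a held-out node
    "both_new" : both endpoints are held-out nodes
    (Labels are illustrative assumptions, not the paper's terminology.)
    """
    test_nodes = set(test_nodes)
    labels = ("train", "one_new", "both_new")
    groups = {label: [] for label in labels}
    for u, v in combinations(nodes, 2):
        held_out = (u in test_nodes) + (v in test_nodes)
        groups[labels[held_out]].append((u, v))
    return groups


# Toy example: six hypothetical genes, three node-level folds.
nodes = [f"g{i}" for i in range(6)]
folds = node_folds(nodes, n_folds=3)
groups = split_pairs(nodes, folds[0])

for label, pairs in groups.items():
    print(label, len(pairs))
```

Under this assumed scheme, a model would be fitted on the "train" pairs only, and performance would be reported separately for the "one_new" and "both_new" groups, since predicting pairs that involve unseen entities is generally a different task from predicting pairs between entities already seen in training.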
Similar articles
- Network inference with ensembles of bi-clustering trees. BMC Bioinformatics. 2019 Oct 28;20(1):525. doi: 10.1186/s12859-019-3104-y. PMID: 31660848. Free PMC article.
- Algebraic shortcuts for leave-one-out cross-validation in supervised network inference. Brief Bioinform. 2020 Jan 17;21(1):262-271. doi: 10.1093/bib/bby095. PMID: 30329015.
- Classifying pairs with trees for supervised biological network inference. Mol Biosyst. 2015 Aug;11(8):2116-25. doi: 10.1039/c5mb00174a. PMID: 26008881. Free PMC article.
- Biological Network Inference and analysis using SEBINI and CABIN. Methods Mol Biol. 2009;541:551-76. doi: 10.1007/978-1-59745-243-4_24. PMID: 19381531. Review.
- A review of active learning approaches to experimental design for uncovering biological networks. PLoS Comput Biol. 2017 Jun 1;13(6):e1005466. doi: 10.1371/journal.pcbi.1005466. eCollection 2017 Jun. PMID: 28570593. Free PMC article. Review.
Cited by
- Machine Learning of Protein Interactions in Fungal Secretory Pathways. PLoS One. 2016 Jul 21;11(7):e0159302. doi: 10.1371/journal.pone.0159302. eCollection 2016. PMID: 27441920. Free PMC article.
- Ranking genome-wide correlation measurements improves microarray and RNA-seq based global and targeted co-expression networks. Sci Rep. 2018 Jul 18;8(1):10885. doi: 10.1038/s41598-018-29077-3. PMID: 30022075. Free PMC article.
- Cold-Start Problems in Data-Driven Prediction of Drug-Drug Interaction Effects. Pharmaceuticals (Basel). 2021 May 2;14(5):429. doi: 10.3390/ph14050429. PMID: 34063324. Free PMC article.
- Machine Learning Predicts Drug Metabolism and Bioaccumulation by Intestinal Microbiota. Pharmaceutics. 2021 Nov 25;13(12):2001. doi: 10.3390/pharmaceutics13122001. PMID: 34959282. Free PMC article.
- Prediction of Genetic Interactions Using Machine Learning and Network Properties. Front Bioeng Biotechnol. 2015 Oct 26;3:172. doi: 10.3389/fbioe.2015.00172. eCollection 2015. PMID: 26579514. Free PMC article. Review.