Supervised learning with decision tree-based methods in computational and systems biology
- PMID: 20023720
- DOI: 10.1039/b907946g
Abstract
At the intersection of artificial intelligence and statistics, supervised learning allows algorithms to build predictive models automatically from observations of a system alone. Over the last twenty years, supervised learning has been a tool of choice for analyzing the ever-growing and increasingly complex data generated in molecular biology, with successful applications in genome annotation, function prediction, and biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as nonparametric methods with the distinctive feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this class of methods. The first part of the review gives an intuitive but complete description of decision tree-based methods and discusses their strengths and limitations relative to other supervised learning methods. The second part surveys their applications in computational and systems biology.