Random forests for genomic data analysis
- PMID: 22546560
- PMCID: PMC3387489
- DOI: 10.1016/j.ygeno.2012.04.003
Abstract
Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and can account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications of RF to genomic data and recent progress in the field, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.
Copyright © 2012 Elsevier Inc. All rights reserved.
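To make the "large p, small n" setting concrete, the following is a minimal sketch of a toy random forest in plain Python: each tree is grown on a bootstrap sample, and each split considers only a random subset of `mtry` features, chosen by Gini impurity. It is not the implementation reviewed in the article. The synthetic data (30 samples, 100 features, with the label driven by the first two features), the depth limit, the function names, and the use of split counts as a crude importance proxy are all assumptions made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, rows, features):
    """Return (feature, threshold, weighted Gini) of the best split, or None."""
    best = None
    for j in features:
        values = sorted({X[i][j] for i in rows})
        # Thresholds are observed values below the maximum, so both sides are non-empty.
        for t in values[:-1]:
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def grow_tree(X, y, rows, mtry, rng, depth=0, max_depth=4):
    """Grow one tree; a leaf is a class label, an internal node is a tuple."""
    labels = [y[i] for i in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    # Random feature subset at every split (the "random" in random forest).
    features = rng.sample(range(len(X[0])), mtry)
    split = best_split(X, y, rows, features)
    if split is None:
        return majority
    j, t, _ = split
    left = [i for i in rows if X[i][j] <= t]
    right = [i for i in rows if X[i][j] > t]
    return (j, t,
            grow_tree(X, y, left, mtry, rng, depth + 1, max_depth),
            grow_tree(X, y, right, mtry, rng, depth + 1, max_depth))

def fit_forest(X, y, n_trees=25, mtry=None, seed=0):
    """Fit n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = mtry or max(1, int(p ** 0.5))
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(grow_tree(X, y, rows, mtry, rng))
    return forest

def predict_tree(node, x):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] <= t else right
    return node

def predict_forest(forest, x):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, x) for t in forest).most_common(1)[0][0]

def split_counts(forest, p):
    """Count how often each feature is used in a split (crude importance proxy)."""
    counts = [0] * p
    stack = list(forest)
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            counts[node[0]] += 1
            stack.extend(node[2:])
    return counts

# Hypothetical "large p, small n" data: 30 samples, 100 features.
data_rng = random.Random(1)
n, p = 30, 100
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]

forest = fit_forest(X, y, n_trees=25)
counts = split_counts(forest, p)
top = sorted(range(p), key=lambda j: -counts[j])[:5]
print("most frequently split features:", top)
print("prediction for first sample:", predict_forest(forest, X[0]), "true:", y[0])
```

Split counts are used here only because they are simple to compute; permutation-based importance on out-of-bag samples is the more common choice in practice and would replace `split_counts` in a fuller treatment.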
