Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set
- PMID: 19697302
- PMCID: PMC3744197
- DOI: 10.1002/sim.3707
Abstract
Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, doing so discards ordering information that could improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.
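To make the general idea concrete, the following is a minimal sketch of bootstrap aggregation over an ordinal response, under several assumptions that are not taken from the paper. The ordinal levels are coded as integers (for example, stage I to IV as 0 to 3). The base learner is a hypothetical one-split stump whose split cost is the summed absolute distance between each observed level and the node's median level. That cost is a simple stand-in that penalizes distant misclassifications more than adjacent ones; it is not the ordinal impurity or ordered twoing function evaluated in the study. The function names `fit_stump`, `predict_stump`, and `bagged_predict` are illustrative only.

```python
import random
from collections import Counter
from statistics import median_low


def fit_stump(X, y):
    """Fit a one-split stump; y holds integer-coded ordinal levels.

    The split cost is the summed absolute distance to each node's
    median level (an illustrative stand-in, not the paper's criterion).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not right:  # threshold at the maximum gives no split
                continue
            lpred, rpred = median_low(left), median_low(right)
            cost = (sum(abs(yi - lpred) for yi in left)
                    + sum(abs(yi - rpred) for yi in right))
            if best is None or cost < best[0]:
                best = (cost, j, t, lpred, rpred)
    if best is None:  # no usable split: predict the median level everywhere
        level = median_low(y)
        return (0, float("inf"), level, level)
    return best[1:]


def predict_stump(model, row):
    """Return the predicted ordinal level for one feature vector."""
    j, t, lpred, rpred = model
    return lpred if row[j] <= t else rpred


def bagged_predict(X, y, X_new, n_trees=25, seed=0):
    """Fit one stump per bootstrap resample and aggregate by plurality vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    preds = []
    for row in X_new:
        votes = Counter(predict_stump(m, row) for m in models)
        preds.append(votes.most_common(1)[0][0])
    return preds


# Toy usage: one feature, three ordinal levels coded 0 < 1 < 2.
X_toy = [[0.1], [0.2], [0.3], [1.1], [1.2], [1.3], [2.1], [2.2], [2.3]]
y_toy = [0, 0, 0, 1, 1, 1, 2, 2, 2]
print(bagged_predict(X_toy, y_toy, [[0.0], [1.2], [2.5]]))
```

A nominal criterion such as the misclassification count would treat every wrong level as equally wrong; the absolute-distance cost used here is one simple way a split can take the ordering into account. In practice, deeper trees and a high-dimensional feature matrix would replace the stump and the single toy feature.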