Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set

K J Archer¹, V R Mas

PMID: 19697302
PMCID: PMC3744197
DOI: 10.1002/sim.3707

Stat Med. 2009 Dec 20;28(29):3597-610.

Affiliation

¹ Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298-0032, USA. kjarcher@vcu.edu

Abstract

Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in doing so some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.
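
As a rough illustration of the bootstrap aggregation and out-of-bag ideas mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The function names `bootstrap_split` and `aggregate_votes` are hypothetical, and plain majority voting is assumed for aggregation; the abstract does not state how the ensemble combines ordinal predictions.

```python
import random
from collections import Counter


def bootstrap_split(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag) index lists."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    # Observations never drawn are "out-of-bag" and can be used for evaluation.
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag


def aggregate_votes(votes):
    """Return the most frequent label (assumed rule: simple majority vote).

    Ties go to the label that was encountered first.
    """
    return Counter(votes).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
in_bag, out_of_bag = bootstrap_split(10, rng)

# Hypothetical per-tree predictions for one out-of-bag observation.
predicted_stage = aggregate_votes(["II", "III", "II"])
```

Each tree in the ensemble would be grown on its own `in_bag` sample, and each observation would be scored only by the trees for which it was out-of-bag.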

Figures

**Figure 1**
Candidate split s divides node t into left (*t_L*) and right (*t_R*) descendant nodes such that a proportion *p_L* of the cases in t go into *t_L* and a proportion *p_R* go into *t_R*.

**Figure 2**
Right-hand figure: Boxplots of misclassification rates for each of the five impurity methods calculated using out-of-bag observations from the simulation study. Left-hand figure: Boxplots of the gamma statistic for each of the five impurity methods calculated using out-of-bag observations from the simulation study.

**Figure 3**
Right-hand figure: Boxplots of misclassification rates for each of the four impurity methods calculated using out-of-bag observations from the case application. Left-hand figure: Boxplots of the gamma statistic for each of the four impurity methods calculated using out-of-bag observations from the case application.
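
The two performance measures named in the Figure 2 and Figure 3 captions can be sketched as follows. This assumes that "gamma statistic" refers to the Goodman-Kruskal gamma, (C − D)/(C + D) over concordant and discordant pairs, and that ordinal classes are coded as integers; neither detail is spelled out on this page, and the function names are hypothetical.

```python
def gamma_statistic(observed, predicted):
    """Goodman-Kruskal gamma between two integer-coded ordinal sequences.

    Pairs tied on either variable are ignored, as in the usual definition.
    """
    concordant = discordant = 0
    n = len(observed)
    for i in range(n):
        for j in range(i + 1, n):
            product = (observed[i] - observed[j]) * (predicted[i] - predicted[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


def misclassification_rate(observed, predicted):
    """Fraction of observations whose predicted class differs from the observed one."""
    return sum(o != p for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical out-of-bag results with stages I-IV coded 1-4.
observed = [1, 2, 3, 4]
predicted = [1, 2, 4, 3]
gamma = gamma_statistic(observed, predicted)        # 5 concordant, 1 discordant pair
error = misclassification_rate(observed, predicted)  # 2 of 4 misclassified
```

Unlike the misclassification rate, gamma rewards predictions that preserve the ordering of the classes even when individual labels are wrong, which is why the two measures can disagree.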
