Stat Med. 2009 Dec 20;28(29):3597-610.
doi: 10.1002/sim.3707.

Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set

K J Archer et al. Stat Med. 2009.

Abstract

Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in doing so some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.
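As a rough illustration of bootstrap aggregation with out-of-bag evaluation, the sketch below fits many base learners on bootstrap resamples and predicts each observation only from the learners that did not see it. It is not the authors' implementation: the base learner `fit_stump` is a hypothetical single-threshold classifier standing in for a full classification tree, it uses plain misclassification count rather than any ordinal splitting function, and the names `bootstrap_indices` and `bagged_oob_predictions` are assumptions made for this example.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label; ties are broken by first-seen order."""
    return Counter(labels).most_common(1)[0][0]


def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


def fit_stump(xs, ys):
    """Hypothetical base learner: one threshold on a single numeric feature.

    A stand-in for a full tree; it minimises training misclassifications.
    """
    best = None
    for thr in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        if not right:  # threshold does not split the node
            continue
        l, r = majority(left), majority(right)
        errors = sum(y != l for y in left) + sum(y != r for y in right)
        if best is None or errors < best[0]:
            best = (errors, thr, l, r)
    if best is None:  # every x is identical, so predict the majority class
        m = majority(ys)
        return lambda x: m
    _, thr, l, r = best
    return lambda x: l if x <= thr else r


def bagged_oob_predictions(xs, ys, n_trees=50, seed=0):
    """Majority vote per observation over learners for which it was out-of-bag."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        in_bag, oob = bootstrap_indices(n, rng)
        model = fit_stump([xs[i] for i in in_bag], [ys[i] for i in in_bag])
        for i in oob:
            votes[i][model(xs[i])] += 1
    # None marks an observation that was never out-of-bag
    return [v.most_common(1)[0][0] if v else None for v in votes]
```

The out-of-bag predictions returned here could then be compared with the observed classes to estimate error without a separate test set.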

Figures

Figure 1
Candidate split s divides node t into left (tL) and right (tR) descendant nodes such that a proportion pL of the cases in t go into tL and a proportion pR of the cases go into tR.
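To make the notation in the Figure 1 caption concrete, the sketch below scores a candidate split with the standard tree-growing criterion, the impurity of node t minus the pL- and pR-weighted impurities of tL and tR. The Gini index is used only as a familiar placeholder; the ordinal impurity and ordered twoing functions studied in the paper are not reproduced here, and the function names are assumptions for this example.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity=gini):
    """Impurity of t minus the weighted impurities of tL and tR.

    parent, left, right are the class labels in t, tL and tR;
    any other impurity function can be passed via `impurity`.
    """
    p_left = len(left) / len(parent)    # pL: share of cases sent to tL
    p_right = len(right) / len(parent)  # pR: share of cases sent to tR
    return impurity(parent) - p_left * impurity(left) - p_right * impurity(right)


# A split that separates the two classes perfectly removes all impurity.
perfect = impurity_decrease([1, 1, 2, 2], [1, 1], [2, 2])
# A split that leaves both children as mixed as the parent removes none.
useless = impurity_decrease([1, 1, 2, 2], [1, 2], [1, 2])
```

A tree-growing routine would evaluate this quantity for every candidate split of a node and keep the split with the largest decrease.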
Figure 2
Right-hand figure: Boxplots of misclassification rates for each of the five impurity methods calculated using out-of-bag observations from the simulation study. Left-hand figure: Boxplots of the gamma statistic for each of the five impurity methods calculated using out-of-bag observations from the simulation study.
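The two performance measures named in the caption can be sketched as follows, assuming the gamma statistic refers to Goodman and Kruskal's gamma, (C - D) / (C + D), where C and D count concordant and discordant pairs of observed and predicted classes. Classes are assumed to be coded as ordered integers, and the function names are chosen for this example only.

```python
from itertools import combinations


def misclassification_rate(observed, predicted):
    """Fraction of observations whose predicted class differs from the observed one."""
    return sum(o != p for o, p in zip(observed, predicted)) / len(observed)


def gamma_statistic(observed, predicted):
    """Goodman and Kruskal's gamma over all pairs of observations.

    A pair is concordant when observed and predicted classes are ordered
    the same way, discordant when ordered oppositely; tied pairs are ignored.
    """
    concordant = discordant = 0
    for (o1, p1), (o2, p2) in combinations(zip(observed, predicted), 2):
        product = (o1 - o2) * (p1 - p2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return float("nan")  # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


observed = [1, 2, 3, 4]
predicted = [1, 2, 4, 3]
rate = misclassification_rate(observed, predicted)  # 2 of 4 wrong
gamma = gamma_statistic(observed, predicted)        # 5 concordant, 1 discordant
```

Unlike the misclassification rate, gamma rewards predictions that preserve the ordering of the classes even when the exact class is missed, which is why both measures are informative for an ordinal response.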
Figure 3
Right-hand figure: Boxplots of misclassification rates for each of the four impurity methods calculated using out-of-bag observations from the case application. Left-hand figure: Boxplots of the gamma statistic for each of the four impurity methods calculated using out-of-bag observations from the case application.
