A protocol for building and evaluating predictors of disease state based on microarray data

Lodewyk F A Wessels¹, Marcel J T Reinders, Augustinus A M Hart, Cor J Veenman, Hongyue Dai, Yudong D He, Laura J van't Veer

Affiliation

¹ Department of Mediamatics, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands. l.f.a.wessels@ewi.tudelft.nl

PMID: 15817694
DOI: 10.1093/bioinformatics/bti429

Comparative Study

Lodewyk F A Wessels et al. Bioinformatics. 2005 Oct 1;21(19):3755-62. doi: 10.1093/bioinformatics/bti429. Epub 2005 Apr 7.

Abstract

Motivation: Microarray gene expression data are increasingly employed to identify sets of marker genes that accurately predict disease development and outcome in cancer. Many computational approaches have been proposed to construct such predictors. However, there is, as yet, no objective way to evaluate whether a new approach truly improves on the current state of the art. In addition, no 'standard' computational approach has emerged that enables robust outcome prediction.

Results: An important contribution of this work is the description of a principled training and validation protocol, which allows objective evaluation of the complete methodology for constructing a predictor. We review the possible choices of computational approaches, with specific emphasis on predictor choice and reporter selection strategies. Employing this training-validation protocol, we evaluated different reporter selection strategies and predictors on six gene expression datasets of varying degrees of difficulty. We demonstrate that simple reporter selection strategies (forward filtering and shrunken centroids) work surprisingly well and outperform partial least squares in four of the six datasets. Similarly, simple predictors, such as the nearest mean classifier, outperform more complex classifiers. Our training-validation protocol provides a robust methodology to evaluate the performance of new computational approaches and to objectively compare outcome predictions on different datasets.
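The abstract does not spell out the protocol's details, so the following is only a minimal sketch of one common way to evaluate a complete predictor-building methodology: nested cross-validation, in which reporter (gene) selection and the choice of the number of reporters are repeated inside every training split so that held-out samples never influence them. Everything specific here is an assumption for illustration: the signal-to-noise ranking used as the filter criterion, the fold counts, the two-class 0/1 labels, and all function names (`rank_genes`, `validate`, `toy_data`, and so on). The classifier is a plain nearest mean classifier, as named in the abstract.

```python
import random
from statistics import mean, pstdev


def rank_genes(X, y):
    """Rank genes by absolute signal-to-noise ratio between classes 0 and 1
    (an assumed filter criterion, not taken from the paper)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        spread = (pstdev(a) + pstdev(b)) or 1e-12  # guard against zero spread
        scores.append(abs(mean(a) - mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def fit_nearest_mean(X, y, genes):
    """Compute one centroid per class, restricted to the selected genes."""
    centroids = {}
    for lab in set(y):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]
    return centroids


def predict(centroids, row, genes):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[g] - m) ** 2 for g, m in zip(genes, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def stratified_folds(y, k, rng):
    """Split positions 0..len(y)-1 into k folds, balancing the classes."""
    folds = [[] for _ in range(k)]
    for lab in sorted(set(y)):
        idx = [i for i, l in enumerate(y) if l == lab]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def train_and_test(X, y, train, test, n_genes):
    """Select genes and fit on `train` only, then count errors on `test`."""
    Xtr = [X[i] for i in train]
    ytr = [y[i] for i in train]
    genes = rank_genes(Xtr, ytr)[:n_genes]
    model = fit_nearest_mean(Xtr, ytr, genes)
    return sum(predict(model, X[i], genes) != y[i] for i in test)


def cv_error(X, y, idx, n_genes, k, rng):
    """Inner loop: cross-validated error rate using only the samples in `idx`."""
    labels = [y[i] for i in idx]
    errors = 0
    for fold in stratified_folds(labels, k, rng):
        test = [idx[j] for j in fold]
        held = set(test)
        train = [i for i in idx if i not in held]
        errors += train_and_test(X, y, train, test, n_genes)
    return errors / len(idx)


def validate(X, y, sizes, outer_k=3, inner_k=3, seed=0):
    """Outer loop: pick the number of genes by inner CV on the training part,
    then score the resulting predictor on the untouched validation fold."""
    rng = random.Random(seed)
    all_idx = list(range(len(y)))
    errors = 0
    for test in stratified_folds(y, outer_k, rng):
        held = set(test)
        train = [i for i in all_idx if i not in held]
        best = min(sizes, key=lambda n: cv_error(X, y, train, n, inner_k, rng))
        errors += train_and_test(X, y, train, test, best)
    return errors / len(y)


def toy_data(n_per_class=12, n_genes=20, seed=1):
    """Synthetic data (hypothetical): gene 0 separates the classes, the rest is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for lab in (0, 1):
        for _ in range(n_per_class):
            row = [rng.uniform(-1, 1) for _ in range(n_genes)]
            row[0] = 10 * lab + rng.uniform(-0.5, 0.5)
            X.append(row)
            y.append(lab)
    return X, y


if __name__ == "__main__":
    X, y = toy_data()
    print("estimated validation error:", validate(X, y, sizes=[1, 5]))
```

The point of the structure is that `rank_genes` is never called on samples that are later used for scoring. Ranking the genes once on the full dataset and cross-validating only the classifier would leak information from the validation samples into the selection step and tend to give optimistic error estimates.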
