Bias in error estimation when using cross-validation for model selection
- PMID: 16504092
- PMCID: PMC1397873
- DOI: 10.1186/1471-2105-7-91
Abstract
Background: Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data.
Results: We used CV to optimize the classification parameters for two kinds of classifiers: Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids, while Leave-One-Out CV (LOOCV) was used for the SVM. Independent test data were created to estimate the true error. With "null" and "non-null" (with differential expression between the classes) data, we also tested a nested CV procedure, in which an inner CV loop tunes the parameters while an outer CV loop computes an estimate of the error. The CV error estimate for the classifier with the optimal parameters was found to be a substantially biased estimate of the true error that the classifier would incur on independent data. Even though there is no real difference between the two classes for the "null" datasets, the CV error estimate for the Shrunken Centroids classifier with the optimal parameters was less than 30% on 18.5% of simulated training datasets. For the SVM with optimal parameters, the estimated error rate was less than 30% on 38% of "null" datasets. Performance of the optimized classifiers on the independent test set was no better than chance. The nested CV procedure reduces the bias considerably and gives an estimate of the error that is very close to that obtained on the independent test set, for both Shrunken Centroids and SVM classifiers and for both "null" and "non-null" data distributions.
Conclusion: We show that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error. Proper use of CV for estimating the true error of a classifier developed with a well-defined algorithm requires that all steps of the algorithm, including classifier parameter tuning, be repeated in each CV loop. A nested CV procedure provides an almost unbiased estimate of the true error.
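The two procedures contrasted above (reporting the tuned classifier's own CV error versus wrapping the tuning inside an outer CV loop) can be sketched with scikit-learn on a "null" dataset. This is an illustrative sketch, not the paper's original code: the parameter grid, fold counts, and dataset dimensions here are assumed for demonstration only.

```python
# Sketch: biased CV estimate vs. nested CV on a "null" dataset
# (labels independent of features, so true accuracy is chance, ~50%).
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 1000))   # 40 samples, 1000 random features
y = rng.integers(0, 2, size=40)       # labels drawn independently of X

param_grid = {"C": [0.1, 1.0, 10.0]}  # illustrative tuning grid
inner = KFold(n_splits=5, shuffle=True, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=1)

# Biased procedure: tune C by CV, then report that same CV score as
# the error estimate -- the data used for selection also scores the model.
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=inner).fit(X, y)
biased_acc = search.best_score_

# Nested procedure: the entire tuning step (inner loop) is repeated inside
# every outer fold, so the outer estimate comes from data never used to
# select the parameters.
nested_acc = cross_val_score(
    GridSearchCV(SVC(kernel="linear"), param_grid, cv=inner),
    X, y, cv=outer,
).mean()

print(f"biased CV accuracy: {biased_acc:.2f}")
print(f"nested CV accuracy: {nested_acc:.2f}")
```

On null data the nested estimate should stay near chance, while the biased estimate tends to drift optimistic; the effect grows with the number of parameter settings (or selected features) searched over, which is why the paper's Shrunken Centroids results, where the number of retained genes is also tuned, show such large bias.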