Classifier design for computer-aided diagnosis: effects of finite sample size on the mean performance of classical and neural network classifiers
- PMID: 10619251
- DOI: 10.1118/1.598805
Abstract
Classifier design is one of the key steps in the development of computer-aided diagnosis (CAD) algorithms. A classifier is designed with case samples drawn from the patient population. Generally, the sample size available for classifier design is limited, which introduces variance and bias into the performance of the trained classifier relative to that obtained with an infinite sample size. For CAD applications, a commonly used performance index for a classifier is the area, Az, under the receiver operating characteristic (ROC) curve. We conducted a computer simulation study to investigate the dependence of the mean performance, in terms of Az, on design sample size for a linear discriminant and two nonlinear classifiers: the quadratic discriminant and the backpropagation artificial neural network (ANN). The performances of the classifiers were compared for four types of class distributions with specific properties: multivariate normal distributions with equal covariance matrices and unequal means, with unequal covariance matrices and unequal means, and with unequal covariance matrices and equal means; and a feature space in which the two classes were uniformly distributed in disjoint checkerboard regions. We evaluated the performances of the classifiers in feature spaces of dimensionality ranging from 3 to 15 and with design sample sizes from 20 to 800 per class. The dependence of the resubstitution and hold-out performance on design (training) sample size (Nt) was investigated. For multivariate normal class distributions with equal covariance matrices, the linear discriminant is the optimal classifier; it was found that its Az-versus-1/Nt curves can be closely approximated by a linear dependence over the range of sample sizes studied. In the feature spaces with unequal covariance matrices, where the quadratic discriminant is optimal, the linear discriminant is inferior to the quadratic discriminant or the ANN when the design sample size is large.
However, when the design sample is small, a relatively simple classifier, such as the linear discriminant or an ANN with very few hidden nodes, may be preferred because performance bias increases with the complexity of the classifier. In the regime where the classifier performance is dominated by the 1/Nt term, the performance in the limit of infinite sample size can be estimated as the intercept (1/Nt = 0) of a linear regression of Az versus 1/Nt. The understanding of the performance of the classifiers under the constraint of a finite design sample size is expected to facilitate the selection of a proper classifier for a given classification task and the design of an efficient resampling scheme.
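The Az-versus-1/Nt extrapolation described in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: it trains a Fisher linear discriminant on two multivariate normal classes with equal (identity) covariance matrices, measures hold-out Az on a large independent test set for several design sample sizes Nt, and estimates the infinite-sample Az as the intercept of a linear fit of Az against 1/Nt. The dimensionality (d = 5), mean shift (0.5 per feature), and Monte Carlo settings are arbitrary assumptions chosen for speed.

```python
import numpy as np

rng = np.random.default_rng(0)

def auc(scores_pos, scores_neg):
    # Wilcoxon-Mann-Whitney (nonparametric) estimate of the ROC area Az
    diff = scores_pos[:, None] - scores_neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def fisher_lda(x0, x1):
    # Fisher linear discriminant: w = pooled_cov^{-1} (mu1 - mu0)
    mu0, mu1 = x0.mean(0), x1.mean(0)
    s = ((x0 - mu0).T @ (x0 - mu0) + (x1 - mu1).T @ (x1 - mu1)) \
        / (len(x0) + len(x1) - 2)
    return np.linalg.solve(s, mu1 - mu0)

d = 5                     # feature-space dimensionality (assumption)
delta = np.full(d, 0.5)   # class-mean separation; covariance = identity
inv_nt, az_holdout = [], []
for nt in (20, 50, 100, 200, 400, 800):   # design samples per class
    vals = []
    for _ in range(100):                  # Monte Carlo repetitions
        x0 = rng.standard_normal((nt, d))
        x1 = rng.standard_normal((nt, d)) + delta
        w = fisher_lda(x0, x1)
        # a large independent test set approximates the population Az
        t0 = rng.standard_normal((1000, d))
        t1 = rng.standard_normal((1000, d)) + delta
        vals.append(auc(t1 @ w, t0 @ w))
    inv_nt.append(1.0 / nt)
    az_holdout.append(np.mean(vals))

# linear regression of mean hold-out Az on 1/Nt;
# the intercept (1/Nt = 0) estimates Az in the infinite-sample limit
slope, intercept = np.polyfit(inv_nt, az_holdout, 1)
```

With these settings the hold-out Az rises toward the ideal-observer value as Nt grows, so the fitted slope is negative, consistent with the abstract's observation that hold-out performance is biased low at small design sample sizes.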
Similar articles
- Effect of finite sample size on feature selection and classification: a simulation study. Med Phys. 2010 Feb;37(2):907-20. doi: 10.1118/1.3284974. PMID: 20229900. Free PMC article.
- Feature selection and classifier performance in computer-aided diagnosis: the effect of finite sample size. Med Phys. 2000 Jul;27(7):1509-22. doi: 10.1118/1.599017. PMID: 10947254. Free PMC article.
- Classifier performance prediction for computer-aided diagnosis using a limited dataset. Med Phys. 2008 Apr;35(4):1559-70. doi: 10.1118/1.2868757. PMID: 18491550. Free PMC article.
- Trainable fusion rules. I. Large sample size case. Neural Netw. 2006 Dec;19(10):1506-16. doi: 10.1016/j.neunet.2006.01.018. Epub 2006 Apr 3. PMID: 16580815. Review.
- Optimization of Network Topology in Computer-Aided Detection Schemes Using Phased Searching with NEAT in a Time-Scaled Framework. Cancer Inform. 2014 Oct 13;13(Suppl 1):17-27. doi: 10.4137/CIN.S13885. eCollection 2014. PMID: 25392680. Free PMC article. Review.
Cited by
- Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning. J Med Imaging (Bellingham). 2020 Jan;7(1):012703. doi: 10.1117/1.JMI.7.1.012703. Epub 2019 Nov 22. PMID: 31763356. Free PMC article.
- Potential of computer-aided diagnosis of high spectral and spatial resolution (HiSS) MRI in the classification of breast lesions. J Magn Reson Imaging. 2014 Jan;39(1):59-67. doi: 10.1002/jmri.24145. Epub 2013 Sep 10. PMID: 24023011. Free PMC article.
- Bladder Cancer Treatment Response Assessment in CT using Radiomics with Deep-Learning. Sci Rep. 2017 Aug 18;7(1):8738. doi: 10.1038/s41598-017-09315-w. PMID: 28821822. Free PMC article.
- Dynamic multiple thresholding breast boundary detection algorithm for mammograms. Med Phys. 2010 Jan;37(1):391-401. doi: 10.1118/1.3273062. PMID: 20175501. Free PMC article.
- Noise injection for training artificial neural networks: a comparison with weight decay and early stopping. Med Phys. 2009 Oct;36(10):4810-8. doi: 10.1118/1.3213517. PMID: 19928111. Free PMC article.