Proc AMIA Symp. 1998:523-7.

Improving machine learning performance by removing redundant cases in medical data sets

L Ohno-Machado et al. Proc AMIA Symp. 1998.

Abstract

Neural network models and other machine learning methods have been applied successfully to several medical classification problems. These models can be periodically refined and retrained as new cases become available. Because training neural networks by backpropagation is time-consuming, it is desirable to keep only a minimum number of representative cases in the training set (i.e., to remove redundant cases). The removal of redundant cases should be carefully monitored so that classification performance is not significantly affected. We conducted data-removal experiments on a data set of 700 patients suspected of having myocardial infarction and show that there is no statistically significant difference in classification performance (measured by the differences in areas under the ROC curve on two previously unseen sets of 553 and 500 cases) when as many as 86% of the cases are randomly removed. A proportional reduction in the time required to train the neural network model is achieved.
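
The two operations the abstract relies on, random removal of training cases and comparison by area under the ROC curve, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `remove_random_cases` and `roc_auc` are hypothetical, the neural network itself is left out, and the area is computed with the standard rank-based (Mann-Whitney) formulation, which the abstract does not specify.

```python
import random


def remove_random_cases(cases, fraction_removed, seed=0):
    """Return a random subset of `cases` after dropping `fraction_removed` of them.

    Hypothetical helper: the abstract only states that cases were removed at random.
    """
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    n_remove = round(len(cases) * fraction_removed)
    n_keep = len(cases) - n_remove
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(cases, n_keep)


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Computed as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Placeholder case identifiers standing in for the 700 training cases.
    training_cases = list(range(700))
    reduced = remove_random_cases(training_cases, 0.86)
    print(len(reduced))  # 98 cases remain after removing 86%

    # Toy scores, not data from the study.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In a full experiment, one model would be trained on the complete set and another on the reduced set, and `roc_auc` would be evaluated for each on the same held-out test cases before testing whether the two areas differ.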
