Int Workshop Pattern Recognit Neuroimaging. 2012 Jul:45-48.
doi: 10.1109/PRNI.2012.27.

Parameter Selection in Mutual Information-Based Feature Selection in Automated Diagnosis of Multiple Epilepsies Using Scalp EEG

Wesley T Kerr et al. Int Workshop Pattern Recognit Neuroimaging. 2012 Jul.

Abstract

Developing EEG-based computer-aided diagnostic (CAD) tools would allow identification of epilepsy in individuals who have experienced possible seizures, yet such an algorithm requires efficient identification of meaningful features from potentially more than 35,000 features of EEG activity. Mutual information can be used to identify a subset of minimally redundant and maximally relevant (mRMR) features, but it requires a priori selection of two parameters: the number of features of interest and the number of quantization levels into which the continuous features are binned. Here we characterize the variance of cross-validation accuracy with respect to changes in these parameters for four classes of machine learning (ML) algorithms. This analysis assesses the efficiency of combining mRMR with each of these algorithms by determining when the variance of cross-validation accuracy is minimized, and it demonstrates how naive parameter selection may artificially depress accuracy. Our results can be used to improve the understanding of how feature selection interacts with four classes of ML algorithms and provide guidance for better a priori parameter selection in situations where an overwhelming number of redundant, noisy features are available for classification.
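
To make the two parameters concrete, the following is a minimal sketch of greedy mRMR selection, not the authors' implementation. It assumes equal-width binning for the quantization step, a plug-in estimate of mutual information, and the difference form of the criterion (relevance minus mean redundancy) described by Peng et al. The function names (`quantize`, `mutual_information`, `mrmr_select`) and the synthetic data are hypothetical and chosen only for illustration; `n_select` and `n_levels` stand for the number of features of interest and the number of quantization levels.

```python
import math
import random
from collections import Counter


def quantize(values, n_levels):
    """Bin continuous values into n_levels equal-width bins (labels 0..n_levels-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_levels
    # The maximum value would land in bin n_levels, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_levels - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, n_select, n_levels):
    """Greedy mRMR, difference form: relevance minus mean redundancy.

    features: list of columns, each a list of continuous values.
    Returns the indices of the selected columns in selection order.
    """
    binned = [quantize(col, n_levels) for col in features]
    relevance = [mutual_information(col, labels) for col in binned]
    selected = []
    remaining = set(range(len(features)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(binned[j], binned[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < n_select:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic illustration (assumed data, not EEG): one informative feature,
# an exact duplicate of it, a noisier informative feature, and pure noise.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
informative = [y + rng.gauss(0, 0.3) for y in labels]
duplicate = list(informative)
weaker = [y + rng.gauss(0, 0.6) for y in labels]
noise = [rng.gauss(0, 1) for _ in labels]

selected = mrmr_select(
    [informative, duplicate, weaker, noise], labels, n_select=2, n_levels=4
)
print(selected)
```

In this sketch the exact duplicate is penalized heavily by the redundancy term, so it is not chosen alongside the feature it copies. Changing `n_levels` alters every mutual information estimate, which is one way the choice of quantization can influence which features are selected.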

Keywords: automated diagnosis; epilepsy; feature selection; mutual information; scalp EEG.

Figures

Figure 1
The cross-validation accuracy of all four classifiers. The unsampled points are filled using Akima bivariate interpolation [6]. The bottom right corner is set to 0 due to lack of support. All values less than 40% are rounded up to 40% to maintain contrast. Without multiple testing correction, individual yellow to red points are significantly more accurate than a naive classifier whereas deep blue points are significantly worse.
Figure 2
Variance of cross-validation accuracy with respect to number of input features. Thickness represents standard error.
Figure 3
Variance of cross-validation accuracy with respect to number of quantal levels. Thickness represents standard error.

References

    Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Analysis and Machine Intelligence. 2005;27(8):1226–38.
    Hall P, Morton SC. On the estimation of entropy. Ann Inst Statist Math. 1993;45(1):69–88.
    Bouckaert RR, Frank E, Hall MA, Holmes G, Pfahringer B, Reutemann P, Witten IH. WEKA-experiences with a Java open-source project. J Mach Learn Res. 2010;11:2533–41.
    Akima H. Algorithm 761: scattered-data surface fitting that has the accuracy of a cubic polynomial. ACM Transactions on Mathematical Software. 1996;22:362–71.
    Cleveland WS, Grosse E, Shyu WM. Chapter 8: Local regression models. In: Chambers JM, Hastie TJ, editors. Statistical Models in S. Wadsworth & Brooks/Cole; 1992.
