Comment
Neuroimage. 2014 Jan 1:84:1107-10.
doi: 10.1016/j.neuroimage.2013.07.050. Epub 2013 Jul 25.

The utility of data-driven feature selection: re: Chu et al. 2012

Wesley T Kerr et al. Neuroimage.

Abstract

The recent Chu et al. (2012) manuscript discusses two key findings regarding feature selection (FS): (1) data-driven FS was no better than using whole-brain voxel data and (2) a priori biological knowledge was effective in guiding FS. Use of FS is highly relevant in neuroimaging-based machine learning, as the number of attributes can greatly exceed the number of exemplars. We strongly endorse their demonstration of both of these findings, and we provide additional important practical and theoretical arguments as to why, in their case, the data-driven FS methods they implemented did not result in improved accuracy. Further, we emphasize that the data-driven FS methods they tested performed approximately as well as the all-voxel case. We discuss why a sparse model may be favored over a complex one with similar performance. We caution readers that the findings in the Chu et al. report should not be generalized to all data-driven FS methods.

Keywords: Feature selection; Machine learning; Neuroimaging.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
A reproduction of Chu et al.'s Fig. 9E where the added shading indicates the 95% confidence interval for the no feature selection accuracy using the normal approximation of the binomial distribution. Accuracy using all voxelized features was not significantly higher than data-driven feature selection accuracy at the optimum C, C*. At multiple non-optimum C values, the accuracy using data-driven feature selection was significantly higher than using all voxelized features.
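
The shaded band described in the caption can be sketched as follows. This is a minimal illustration of the normal approximation to the binomial distribution applied to a classification accuracy, not the authors' code. The function names, the accuracy of 0.80, and the test-set size of 100 are hypothetical values chosen for illustration; none of them come from Chu et al. or from this comment.

```python
import math


def binomial_accuracy_ci(accuracy, n_samples, z=1.959964):
    """Normal-approximation (Wald) confidence interval for an accuracy.

    accuracy  : observed proportion of correct classifications, in [0, 1]
    n_samples : number of test exemplars the accuracy was computed on
    z         : standard normal quantile; 1.959964 gives a two-sided 95% interval
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    # Standard error of a binomial proportion.
    se = math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
    # Clip to [0, 1] because the approximation can overshoot near the bounds.
    lower = max(0.0, accuracy - z * se)
    upper = min(1.0, accuracy + z * se)
    return lower, upper


def significantly_higher(candidate_accuracy, reference_accuracy, n_samples):
    """True if the candidate accuracy lies above the reference interval."""
    _, upper = binomial_accuracy_ci(reference_accuracy, n_samples)
    return candidate_accuracy > upper


# Hypothetical numbers: all-voxel accuracy of 0.80 on 100 test exemplars.
low, high = binomial_accuracy_ci(0.80, 100)
print(f"95% CI for no-FS accuracy: [{low:.3f}, {high:.3f}]")
```

Under this reading, a feature-selection accuracy that falls inside the band is not significantly different from the all-voxel accuracy, while one above the band is significantly higher. The normal approximation is least reliable for small samples or for accuracies close to 0 or 1.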
