uEFS: An efficient and comprehensive ensemble-based feature selection methodology to select informative features
- PMID: 30153294
- PMCID: PMC6112679
- DOI: 10.1371/journal.pone.0202705
Abstract
Feature selection is one of the most critical methods for choosing an appropriate subset of features from a larger set of candidates. The task comprises two basic steps: ranking and filtering. The former ranks all features, while the latter filters out irrelevant features based on some threshold value. Several feature selection methods with well-documented capabilities and limitations have already been proposed. Feature ranking itself is also nontrivial, as it requires designating an optimal cutoff value to properly separate important features from the list of candidates. However, a comprehensive feature ranking and filtering approach that alleviates these limitations and provides an efficient mechanism for achieving optimal results has remained an open problem. Keeping these facts in view, we present an efficient and comprehensive univariate ensemble-based feature selection (uEFS) methodology to select informative features from an input dataset. For the uEFS methodology, we first propose a unified features scoring (UFS) algorithm to generate a final ranked list of features following a comprehensive evaluation of the feature set. To define cutoff points for removing irrelevant features, we subsequently present a threshold value selection (TVS) algorithm to select a subset of features deemed important for classifier construction. The uEFS methodology is evaluated using standard benchmark datasets. Extensive experimental results show that the proposed uEFS methodology provides competitive accuracy, achieving on average (1) around a 7% increase in F-measure and (2) around a 5% increase in predictive accuracy compared with state-of-the-art methods.
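The abstract only names the UFS (ranking) and TVS (filtering) algorithms without giving their details, so the following is a minimal illustrative sketch of the general ensemble idea: score every feature under several univariate measures, rank the features under each measure, combine the per-measure ranks by their mean (a UFS-style unified score), and cut the ranked list at a threshold (a TVS-style filter). The Pearson/Spearman scorers, the mean-rank combination, and all function names here are assumptions for illustration, not the paper's actual method.

```python
import math
from typing import Callable, Dict, List, Sequence

def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 for constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def _ranks(v: Sequence[float]) -> List[float]:
    """Fractional ranks of the values (ties receive the average rank)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation as Pearson correlation of the rank vectors."""
    return pearson(_ranks(xs), _ranks(ys))

def ensemble_rank(data: Dict[str, Sequence[float]],
                  target: Sequence[float],
                  scorers: Sequence[Callable]) -> List[str]:
    """UFS-style step (illustrative): rank features under each univariate
    scorer, then order features by their mean rank across scorers."""
    per_scorer = []
    for scorer in scorers:
        scores = {f: abs(scorer(col, target)) for f, col in data.items()}
        order = sorted(scores, key=scores.get, reverse=True)  # best first
        per_scorer.append({f: pos + 1 for pos, f in enumerate(order)})
    mean_rank = {f: sum(r[f] for r in per_scorer) / len(per_scorer)
                 for f in data}
    return sorted(data, key=mean_rank.get)  # lowest mean rank first

def select_top(ranked: List[str], fraction: float) -> List[str]:
    """TVS-style step (illustrative): keep the top fraction of the list."""
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k]
```

On a toy dataset with one feature correlated with the class label and one uncorrelated feature, the correlated feature receives the best mean rank under both scorers and survives the filtering step.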
Conflict of interest statement
The authors have declared that no competing interests exist.