An Improvised Machine Learning Model Based on Mutual Information Feature Selection Approach for Microbes Classification
- PMID: 33672252
- PMCID: PMC7927045
- DOI: 10.3390/e23020257
Abstract
Accurate classification of microbes is critical for monitoring the ecological balance of a habitat. This work implements a novel method to automate the identification of microorganisms. To extract the bodies of microorganisms accurately, a generalized segmentation mechanism is proposed that combines a convolution filter (Kirsch) with a variance-based pixel clustering algorithm (Otsu). After exhaustive corroboration, a set of twenty-five features was identified to map the characteristics and morphology of all kinds of microbes. Multiple feature selection techniques were tested, and mutual information (MI)-based models gave the best performance. Exhaustive hyperparameter tuning was carried out for multilayer perceptron (MLP), k-nearest neighbors (KNN), quadratic discriminant analysis (QDA), logistic regression (LR), and support vector machine (SVM) models. The SVM with a radial kernel required further improvisation to attain the maximum possible accuracy. A comparative analysis of SVM and improvised SVM (ISVM) using 10-fold cross validation showed that ISVM achieved 2% higher performance in terms of accuracy (98.2%), precision (98.2%), recall (98.1%), and F1 score (98.1%).
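Two of the building blocks named in the abstract, Otsu's variance-based thresholding and mutual-information feature ranking, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`otsu_threshold`, `mutual_information`, `rank_features`) are hypothetical, the pixel data is a flat list of 8-bit gray levels, and the features are assumed to be already discretized (continuous morphology features would need binning or a different estimator first).

```python
import math
from collections import Counter


def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant 1/total^2 dropped).
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def mutual_information(feature, labels):
    """Estimate I(X; Y) in bits from paired discrete samples."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels, k):
    """Return the indices of the k feature columns with the highest MI."""
    scores = [mutual_information(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy bimodal "image": dark background and bright foreground.
    pixels = [10] * 50 + [200] * 50
    print("Otsu threshold:", otsu_threshold(pixels))

    # Toy discretized features: column 0 tracks the label, column 1 does not.
    labels = ["a", "a", "b", "b"]
    columns = [[0, 0, 1, 1], [0, 1, 0, 1]]
    print("Top feature index:", rank_features(columns, labels, k=1))
```

In a fuller pipeline the selected feature columns would then be passed to a classifier and scored with k-fold cross validation; that step is omitted here because the abstract does not specify how the SVM was modified.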
Keywords: classification; image segmentation; k-fold cross validation; machine learning modeling; microorganisms; mutual information.
Conflict of interest statement
The authors declare no conflict of interest.