Front Bioeng Biotechnol. 2024 Jan 8;11:1336255.
doi: 10.3389/fbioe.2023.1336255. eCollection 2023.

Breaking barriers: a statistical and machine learning-based hybrid system for predicting dementia

Ashir Javeed et al. Front Bioeng Biotechnol.

Abstract

Introduction: Dementia is a condition (a collection of related signs and symptoms) that causes a progressive deterioration in cognitive function, and it affects millions of people every year as the world population continues to grow. Conventional approaches for diagnosing dementia rely primarily on clinical examinations, analysis of medical records, and cognitive and neuropsychological testing. However, these methods are time-consuming and costly. Therefore, this study aims to present a noninvasive method for the early prediction of dementia so that preventive steps can be taken.
Methods: We developed a hybrid diagnostic system based on statistical and machine learning (ML) methods that uses patient electronic health records to predict dementia. The dataset used for this study was obtained from the Swedish National Study on Aging and Care (SNAC), with a sample size of 43,040 and 75 features. The proposed diagnostic system extracts a subset of useful features from the dataset using a statistical method (F-score). For classification, we developed an ensemble voting classifier based on five different ML models: decision tree (DT), naive Bayes (NB), logistic regression (LR), support vector machine (SVM), and random forest (RF). To address the problem of ML model overfitting, we used a cross-validation approach to evaluate the performance of the proposed diagnostic system. Various assessment measures, such as accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the Matthews correlation coefficient (MCC), were used to validate the efficiency of the proposed diagnostic system.
Results: According to the experimental results, the proposed diagnostic method achieved a best accuracy of 98.25%, with a sensitivity of 97.44%, a specificity of 95.744%, and an MCC of 0.7535.
Discussion: The effectiveness of the proposed diagnostic approach is compared to various cutting-edge feature selection techniques and baseline ML models. From experimental results, it is evident that the proposed diagnostic system outperformed the prior feature selection strategies and baseline ML models regarding accuracy.
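The voting and evaluation steps named in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: the abstract does not say whether voting is hard or soft, so hard (majority) voting is assumed here, and the labels, per-model predictions, and function names (`majority_vote`, `confusion_counts`, `evaluate`) are hypothetical toy values chosen only to show how accuracy, sensitivity, specificity, and MCC follow from the confusion matrix.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions_per_model):
    """Hard voting (assumed): per sample, return the label most models chose."""
    voted = []
    for sample_votes in zip(*predictions_per_model):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def evaluate(tp, tn, fp, fn):
    """Compute the assessment measures listed in the abstract from the counts."""
    total = tp + tn + fp + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # true negative rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical toy data: 1 = dementia, 0 = no dementia.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_predictions = {
    "DT":  [1, 1, 0, 0, 0, 1, 0, 0],
    "NB":  [1, 0, 1, 0, 1, 0, 0, 0],
    "LR":  [1, 1, 0, 0, 0, 0, 0, 0],
    "SVM": [1, 1, 1, 0, 0, 1, 0, 0],
    "RF":  [1, 0, 0, 0, 0, 1, 0, 0],
}

voted = majority_vote(list(model_predictions.values()))
metrics = evaluate(*confusion_counts(y_true, voted))

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With an odd number of binary classifiers, as here, a majority vote can never tie. MCC is reported alongside accuracy because it uses all four confusion-matrix cells, which makes it more informative than accuracy alone when the classes are imbalanced.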

Keywords: F-score; dementia; feature selection; machine learning; voting classifier.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer NA declared a shared affiliation with the author AB to the handling editor at the time of review.

Figures

FIGURE 1. Samples overview in the collected dataset.
FIGURE 2. Working of the proposed framework.
FIGURE 3. ROC curve analysis of baseline ML models.
FIGURE 4. Confusion matrix.
FIGURE 5. ROC curve analysis.
FIGURE 6. Performance comparison of the proposed method with other feature selection methods.

