J Healthc Eng. 2020 Mar 9:2020:4984967.

doi: 10.1155/2020/4984967. eCollection 2020.

Soft Clustering for Enhancing the Diagnosis of Chronic Diseases over Machine Learning Algorithms

Theyazn H H Aldhyani¹, Ali Saleh Alshebami², Mohammed Y Alzahrani³

Affiliations

¹ Department of Computer Sciences and Information Technology, King Faisal University, Al-Hasa 31982, Saudi Arabia.
² Department of Administrative and Financial Sciences, King Faisal University, Al-Hasa 31982, Saudi Arabia.
³ Department of Computer Sciences and Information Technology, Albaha University, Albaha 65527, Saudi Arabia.

PMID: 32211144
PMCID: PMC7085388
DOI: 10.1155/2020/4984967

Theyazn H H Aldhyani et al. J Healthc Eng. 2020.

Abstract

Chronic diseases pose a serious threat to public health worldwide. They are estimated to account for about 60% of all deaths worldwide and approximately 43% of the global burden of disease. Analysis of healthcare data helps health officials, patients, and healthcare communities detect these diseases early, and extracting patterns from such data gives the healthcare community more complete medical data for diagnosis. The objective of this work is to improve the surveillance and detection system for chronic diseases, which serves to protect people's lives. To this end, the proposed system uses machine learning algorithms to enhance the detection of chronic disease. Standard chronic disease datasets were collected from several sources worldwide. In healthcare data, some chronic disease records are ambiguous with respect to class: they exhibit traits of two or more classes, which reduces the accuracy of machine learning algorithms. The novelty of this work lies in using noncrisp Rough K-means (RKM) clustering to identify the ambiguity in chronic disease datasets and thereby improve the performance of the system. The RKM algorithm clusters the data into two sets, namely, the upper approximation and the lower approximation. The objects belonging to the upper approximation are treated as favourable objects, whereas those belonging to the lower approximation are identified as ambiguous. These ambiguous objects are excluded to improve the performance of the machine learning algorithms. Four machine learning algorithms, namely, naïve Bayes (NB), support vector machine (SVM), K-nearest neighbors (KNN), and random forest, are presented and compared.
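The clustering step described above can be illustrated with a minimal sketch of rough K-means. It is not the authors' implementation. It follows the common rough-set convention in which an object close to exactly one centroid goes to that cluster's lower approximation, while an object nearly equidistant from several centroids falls in their boundary regions (upper minus lower approximation) and is flagged as ambiguous; the abstract labels the two sets differently. The names `rough_kmeans` and `drop_ambiguous`, the distance threshold `eps`, and the weight `w_lower` are assumptions made for illustration, since the abstract does not state the paper's parameter choices.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def rough_kmeans(X, centroids, eps=1.0, w_lower=0.7, n_iter=20):
    """Rough K-means sketch (assumed parameters: eps, w_lower, n_iter >= 1).

    Returns (lower, boundary, centroids), where lower[j] and boundary[j]
    are lists of indices into X for cluster j.
    """
    k = len(centroids)
    centroids = [list(c) for c in centroids]
    for _ in range(n_iter):
        lower = [[] for _ in range(k)]
        boundary = [[] for _ in range(k)]
        for i, x in enumerate(X):
            d = [math.dist(x, c) for c in centroids]
            h = min(range(k), key=d.__getitem__)  # nearest centroid
            # Other centroids that are almost as close as the nearest one.
            close = [j for j in range(k) if j != h and d[j] - d[h] <= eps]
            if close:
                # Ambiguous: boundary of every cluster it is close to.
                for j in [h] + close:
                    boundary[j].append(i)
            else:
                # Unambiguous: lower approximation of the nearest cluster.
                lower[h].append(i)
        new = []
        for j in range(k):
            lo = [X[i] for i in lower[j]]
            bd = [X[i] for i in boundary[j]]
            if lo and bd:
                # Weighted mix of the lower-approximation and boundary means.
                ml, mb = mean(lo), mean(bd)
                new.append([w_lower * a + (1 - w_lower) * b
                            for a, b in zip(ml, mb)])
            elif lo:
                new.append(mean(lo))
            elif bd:
                new.append(mean(bd))
            else:
                new.append(centroids[j])  # empty cluster: keep old centroid
        if new == centroids:  # assignments are stable
            break
        centroids = new
    return lower, boundary, centroids

def drop_ambiguous(n_objects, boundary):
    """Indices of objects that appear in no boundary region."""
    ambiguous = {i for members in boundary for i in members}
    return [i for i in range(n_objects) if i not in ambiguous]

# Toy data: two tight groups plus one point midway between them.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
lower, boundary, _ = rough_kmeans(X, centroids=[(0, 0), (10, 10)], eps=1.0)
kept = drop_ambiguous(len(X), boundary)
print(lower, boundary, kept)
```

On this toy data, the midpoint `(5, 5)` lands in the boundary of both clusters and is the only object removed. The remaining indices in `kept` are the ones that would be passed on to a classifier.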
The chronic disease data were obtained from the machine learning repository and Kaggle to test and evaluate the proposed model. The experimental results demonstrate that the proposed system can be successfully employed for the diagnosis of chronic diseases. With respect to accuracy, the best results were achieved by naïve Bayes with RKM for the classification of diabetes (80.55%), SVM with RKM for the classification of kidney disease (100%), and SVM with RKM for the classification of cancer (97.53%). Accuracy, sensitivity, specificity, precision, and F-score are used to evaluate the performance of the proposed system. The proposed system is also evaluated against and compared with the existing machine learning algorithms. Overall, the proposed system enhanced the performance of the machine learning algorithms.
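The five performance measures named above have standard definitions in terms of the binary confusion matrix. The sketch below computes them under those standard definitions, assuming a binary task with one class designated as positive. The function name `binary_metrics` and the toy labels are illustrative and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, precision, and F-score
    computed from the binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }

# Toy labels: 4 positive and 6 negative cases, with one miss and one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

For these toy labels the confusion matrix has 3 true positives, 5 true negatives, 1 false positive, and 1 false negative, giving an accuracy of 0.8 and a sensitivity, precision, and F-score of 0.75.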

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 2**
Sample of ambiguous objects.

**Figure 3**
Snapshot of output RKM algorithm.

**Figure 4**
Comparison of results of the existing naïve Bayes classifier and naïve Bayes using the RKM algorithm for diabetic diseases.

**Figure 5**
Comparison of results of the existing SVM classifier and SVM using the RKM algorithm for diabetic diseases.

**Figure 6**
Comparison of results of the existing random forest classifier and random forest using the RKM algorithm for diabetic diseases.

**Figure 7**
Comparison of results of the existing KNN classifier and KNN using the RKM algorithm for diabetic diseases.

**Figure 8**
Comparison of results of the existing naïve Bayes classifier and naïve Bayes using the RKM algorithm for kidney diseases.

**Figure 9**
Comparison of results of the existing SVM classifier and SVM using the RKM algorithm for kidney diseases.

**Figure 10**
Comparison of results of the existing random forest classifier and random forest using the RKM algorithm for kidney diseases.

**Figure 11**
Comparison of results of the existing KNN classifier and KNN using the RKM algorithm for kidney diseases.

**Figure 12**
Comparison of results of the existing naïve Bayes classifier and naïve Bayes using the RKM algorithm for cancer disease.

**Figure 13**
Comparison of results of the existing SVM classifier and SVM using the RKM algorithm for cancer disease.

**Figure 14**
Comparison of results of the existing random forest classifier and random forest using the RKM algorithm for cancer disease.

**Figure 15**
Comparison of results of the existing KNN classifier and KNN using the RKM algorithm for cancer disease.
