Sci Rep. 2025 Feb 20;15(1):6132. doi: 10.1038/s41598-025-90530-1.

An extensive experimental analysis for heart disease prediction using artificial intelligence techniques

D Rohan et al.
Abstract

The heart is a vital organ that plays a crucial role in sustaining life. Unfortunately, heart disease is one of the leading causes of mortality worldwide. Early and accurate detection can significantly improve outcomes by enabling preventive measures and personalized healthcare recommendations. Artificial intelligence is emerging as a powerful tool for healthcare applications, particularly for predicting heart disease. Researchers are actively working in this area, but accurate heart disease prediction remains challenging, so it is important to experiment with a wide range of models to identify the most effective one. To address this need, this paper conducts an extensive investigation of various models. The proposed research considered 11 feature selection techniques and 21 classifiers. The feature selection techniques considered are Information Gain, the Chi-Square Test, Fisher Discriminant Analysis (FDA), Variance Threshold, Mean Absolute Difference (MAD), Dispersion Ratio, Relief, LASSO, Random Forest Importance, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA). The classifiers considered are Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Gaussian Naïve Bayes (GNB), XGBoost, AdaBoost, Stochastic Gradient Descent (SGD), Gradient Boosting Classifier, Extra Tree Classifier, CatBoost, LightGBM, Multilayer Perceptron (MLP), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional LSTM (BiLSTM), Bidirectional GRU (BiGRU), Convolutional Neural Network (CNN), and a Hybrid Model (CNN, RNN, LSTM, GRU, BiLSTM, BiGRU). Across all experiments, XGBoost outperformed every other model, achieving an accuracy of 0.97, precision of 0.97, sensitivity of 0.98, specificity of 0.98, F1 score of 0.98, and AUC of 0.98.
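To make the experimental setup concrete, the following is a minimal, illustrative sketch in Python that pairs one of the 11 feature selection techniques (the chi-square test, via scikit-learn's SelectKBest) with the best-performing classifier (XGBoost) and computes the reported metrics. The dataset file, column names, and hyperparameters here are assumptions for illustration only, not the authors' exact configuration.

# Minimal sketch of a feature-selection + classification pipeline of the kind
# the paper evaluates: chi-square feature selection feeding an XGBoost classifier.
# Dataset path, column names, and hyperparameters are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)
from xgboost import XGBClassifier

df = pd.read_csv("heart.csv")                     # hypothetical heart-disease dataset
X, y = df.drop(columns=["target"]), df["target"]  # "target": 1 = disease, 0 = no disease

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Chi-square feature selection (requires non-negative feature values).
selector = SelectKBest(score_func=chi2, k=8).fit(X_train, y_train)
X_train_sel, X_test_sel = selector.transform(X_train), selector.transform(X_test)

model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
model.fit(X_train_sel, y_train)

y_pred = model.predict(X_test_sel)
y_prob = model.predict_proba(X_test_sel)[:, 1]

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("accuracy   :", accuracy_score(y_test, y_pred))
print("precision  :", precision_score(y_test, y_pred))
print("sensitivity:", recall_score(y_test, y_pred))  # recall of the positive class
print("specificity:", tn / (tn + fp))                # recall of the negative class
print("F1 score   :", f1_score(y_test, y_pred))
print("AUC        :", roc_auc_score(y_test, y_prob))

Swapping in any of the other feature selection techniques or classifiers named in the abstract amounts to replacing the selector or the model object in this sketch.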

Keywords: Artificial intelligence; Deep learning; Feature selection; Heart disease prediction; Machine learning; Performance metrics; XGBoost.


Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Execution flow of the experiment.
Algorithm 1. Pseudocode for calculating information gain.
Algorithm 2. Pseudocode for the Chi-square test.
Algorithm 3. Pseudocode for FDA.
Algorithm 4. Pseudocode for MAD.
Algorithm 5. Pseudocode for DR.
Algorithm 6. Pseudocode for Relief.
Algorithm 7. Pseudocode for Random Forest Importance.
Algorithm 8. Pseudocode for RF.
Algorithm 9. Pseudocode for XGBoost.
Algorithm 10. Pseudocode for AdaBoost.
Algorithm 11. Pseudocode for SGD.
Algorithm 12. Pseudocode for GB.
Algorithm 13. Pseudocode for ETC.
Algorithm 14. Pseudocode for CatBoost.
Algorithm 15. Pseudocode for LightGBM.
Fig. 2. Model architecture of MLP.
Algorithm 16. Pseudocode for MLP.
Fig. 3. Structure of RNN.
Fig. 4. Model architecture of RNN.
Fig. 5. Structure of LSTM.
Fig. 6. Model architecture of LSTM.
Fig. 7. Structure of GRU.
Fig. 8. Model architecture of GRU.
Fig. 9. Structure of Bi-LSTM.
Fig. 10. Model architecture of Bi-LSTM.
Fig. 11. Structure of Bi-GRU.
Fig. 12. Model architecture of Bi-GRU.
Fig. 13. Model architecture of CNN.
Fig. 14. Model architecture of hybrid model.
Fig. 15. Feature importance of information gain.
Fig. 16. Chi-square test results of the features.
Fig. 17. Fisher’s scores.
Fig. 18. Features selected by variance threshold.
Fig. 19. Mean absolute difference of the features.
Fig. 20. Dispersion ratio of the features.
Fig. 21. Relief feature scores.
Fig. 22. Feature scores of Lasso regularization.
Fig. 23. Random forest importance feature scores.
Fig. 24. Feature importance.
Fig. 25. Explained variance ratio.
Fig. 26. LR without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 27. DT without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 28. RF without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 29. KNN without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 30. SVM without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 31. GNB without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 32. XGBoost without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 33. AdaBoost without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 34. SGD without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 35. GB without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 36. ETC without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 37. CatBoost without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 38. LightGBM without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 39. MLP without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 40. RNN without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 41. LSTM without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 42. GRU without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 43. Bi-LSTM without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 44. Bi-GRU without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 45. CNN without feature selection (a) Confusion matrix (b) ROC Curve.
Fig. 46. Hybrid model without feature selection (a) Confusion matrix (b) ROC Curve.
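
Figures 26 through 46 each pair a confusion matrix with an ROC curve for one classifier evaluated without feature selection. A minimal sketch of how such a pair of plots can be produced with scikit-learn for any fitted classifier is shown below; the variable names (model, X_test_sel, y_test) continue from the pipeline sketch above and are illustrative assumptions, not the authors' code.

# Sketch: confusion-matrix and ROC-curve plots of the kind shown in Figs. 26-46.
# `model`, `X_test_sel`, and `y_test` continue from the earlier pipeline sketch.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, RocCurveDisplay

fig, (ax_cm, ax_roc) = plt.subplots(1, 2, figsize=(10, 4))

# (a) Confusion matrix
ConfusionMatrixDisplay.from_estimator(model, X_test_sel, y_test, ax=ax_cm)
ax_cm.set_title("(a) Confusion matrix")

# (b) ROC curve
RocCurveDisplay.from_estimator(model, X_test_sel, y_test, ax=ax_roc)
ax_roc.set_title("(b) ROC curve")

plt.tight_layout()
plt.show()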


