Med J Armed Forces India. 2021 Jul;77(3):302-311.
doi: 10.1016/j.mjafi.2020.10.013. Epub 2021 Jan 6.

Machine learning-based heart disease prediction system for Indian population: An exploratory study done in South India

Ekta Maini et al. Med J Armed Forces India. 2021 Jul.

Abstract

Background: In India, cardiovascular diseases (CVDs) cause high mortality because they are not diagnosed at an early stage. Machine learning (ML) algorithms can be used to build an efficient and economical prediction system for early diagnosis of CVDs in India.

Methods: A total of 1670 anonymized medical records were collected from a tertiary hospital in South India. Seventy percent of the collected data were used to train the prediction system. Five state-of-the-art ML algorithms (k-Nearest Neighbours, Naïve Bayes, Logistic Regression, AdaBoost and Random Forest [RF]) were applied using the Python programming language to develop the prediction system. Performance was evaluated on the remaining 30% of the data. The prediction system was later deployed in the cloud for easy access via the Internet.
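
The split-and-compare procedure described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract names Python but not a library, so the use of scikit-learn is an assumption, as are the synthetic data (a stand-in for the 1670 hospital records), the 12-feature count, the stratified split, and all hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the 1670 records (assumption: the study data are
# not reproduced here, and the number of features is arbitrary).
X, y = make_classification(n_samples=1670, n_features=12, random_state=0)

# Hold out 30% of the records for testing and train on the remaining 70%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# The five algorithm families named in the abstract, with default or
# illustrative settings (the study's hyperparameters are not given).
models = {
    "k-Nearest Neighbours": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Fit each model on the training subset and record its test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

With 1670 records, a 30% hold-out yields 501 test records, which matches the test-set size reported in the Results. The accuracies printed here describe the synthetic data only and say nothing about the study's models.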

Results: ML effectively predicted the risk of heart disease. The best-performing prediction system (RF) correctly classified 470 of 501 medical records, attaining a diagnostic accuracy of 93.8%. Sensitivity and specificity were 92.8% and 94.6%, respectively. The prediction system attained a positive predictive value of 94% and a negative predictive value of 93.6%. The prediction model developed in this study can be accessed at http://das.southeastasia.cloudapp.azure.com/predict/.
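
The metrics reported above all derive from a 2×2 confusion matrix. The sketch below shows the standard definitions and checks the one figure that can be recomputed from the abstract alone (470 correct out of 501). The function name `diagnostic_metrics` and the toy counts are hypothetical; the abstract does not give the study's confusion matrix, so the toy values are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# The only metric recomputable from the abstract: 470 of 501 correct.
reported_accuracy = 470 / 501
print(f"Reported accuracy: {reported_accuracy:.1%}")

# Toy counts for illustration only (hypothetical, not from the study).
toy = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe how the classifier behaves on patients whose true status is known, whereas the predictive values describe how far a given prediction can be trusted and depend on how common the disease is in the tested population.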

Conclusions: The ML-based prediction system developed in this study performs well in early diagnosis of CVDs and can be accessed via the Internet. These promising results suggest that an ML-based heart disease prediction system could serve as a screening tool in primary healthcare centres in India to diagnose heart diseases that would otherwise go undetected.

Keywords: Affordable healthcare; Cardiovascular diseases; Early diagnosis; Machine learning.

Conflict of interest statement

The authors have none to declare.

Figures

Fig. 1
Workflow diagram of the study. A medical data set of 1670 records was gathered (in random fashion). Seventy percent of the records were used to train the models, and the remaining 30% formed the test subset. Five machine learning algorithms were trained on the training subset. The prediction system was hosted on the public cloud for easy accessibility.
Fig. 2
Study population characteristics. Mean (standard deviation) of numerical attributes, with p-values of the t-test indicating statistical significance between the two groups (high risk/low risk of cardiovascular disease [CVD]). Count (%) of categorical attributes in the two groups.
Fig. 3
Variable importance. (a) Variable importance for AdaBoost-based prediction model. (b) Variable importance for Random Forest–based prediction model.
Fig. 4
Using the cardiovascular disease (CVD) prediction model to test the risk of CVD. The medical practitioner enters the patient's clinical parameters and lifestyle-related attributes to predict the risk of CVD.
