Sci Rep. 2023 Feb 20;13(1):2978.
doi: 10.1038/s41598-023-30188-9.

Risk factors and geographic disparities in premature cardiovascular mortality in US counties: a machine learning approach


Weichuan Dong et al. Sci Rep. 2023.

Erratum in

Abstract

Disparities in premature cardiovascular mortality (PCVM) have been associated with socioeconomic, behavioral, and environmental risk factors. Understanding the "phenotypes", or combinations of characteristics associated with the highest risk of PCVM, and the geographic distributions of these phenotypes is critical to targeting PCVM interventions. This study applied the classification and regression tree (CART) to identify county phenotypes of PCVM and geographic information systems to examine the distributions of identified phenotypes. Random forest analysis was applied to evaluate the relative importance of risk factors associated with PCVM. The CART analysis identified seven county phenotypes of PCVM, where high-risk phenotypes were characterized by having greater percentages of people with lower income, higher physical inactivity, and higher food insecurity. These high-risk phenotypes were mostly concentrated in the Black Belt of the American South and the Appalachian region. The random forest analysis identified additional important risk factors associated with PCVM, including broadband access, smoking, receipt of Supplemental Nutrition Assistance Program benefits, and educational attainment. Our study demonstrates the use of machine learning approaches in characterizing community-level phenotypes of PCVM. Interventions to reduce PCVM should be tailored according to these phenotypes in corresponding geographic areas.
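The phenotype idea in the abstract can be illustrated with a toy regression tree: each root-to-leaf path is a combination of risk-factor thresholds, and a minimum leaf size keeps every phenotype from becoming too small. The following is a minimal sketch, not the authors' code. The function names (`best_split`, `grow`), the two feature names, and all data values are hypothetical, and the minimum leaf size is 2 rather than the 200 counties used in the study. It performs a plain exhaustive search that minimizes the within-node sum of squared errors, with no pruning.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (the node impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets, min_leaf):
    """Return the (feature index, threshold) that most reduces SSE, or None.

    Splits leaving fewer than min_leaf rows on either side are skipped.
    Assumes min_leaf >= 1.
    """
    best, best_cost = None, sse(targets)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (f, t), cost
    return best


def grow(rows, targets, names, min_leaf, path=()):
    """Recursively split; return a list of (path, n, mean target) leaves."""
    split = best_split(rows, targets, min_leaf)
    if split is None:
        return [(path, len(targets), mean(targets))]
    f, t = split
    leaves = []
    for op, keep in (("<=", lambda v: v <= t), (">", lambda v: v > t)):
        side = [(r, y) for r, y in zip(rows, targets) if keep(r[f])]
        rs, ys = zip(*side)
        rule = f"{names[f]} {op} {t}"
        leaves += grow(list(rs), list(ys), names, min_leaf, path + (rule,))
    return leaves


# Synthetic "counties": (low_income_pct, physical_inactivity_pct) -> mortality rate.
names = ["low_income_pct", "physical_inactivity_pct"]
rows = [(10, 20), (12, 22), (11, 30), (13, 31),
        (30, 21), (33, 23), (31, 33), (32, 34)]
targets = [40, 42, 60, 62, 80, 82, 100, 102]

leaves = grow(rows, targets, names, min_leaf=2)
for path, n, avg in leaves:
    print(" AND ".join(path), "| n =", n, "| mean =", avg)
```

Each printed line is one toy phenotype: the conjunction of threshold rules along the path, the number of rows in the leaf, and the leaf's mean outcome. Raising `min_leaf` produces fewer, larger phenotypes.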

Conflict of interest statement

Dr. Dong is supported by contracts from the Cleveland Clinic Foundation, including a subcontract from Celgene Corporation, and has no other competing interests to report. Dr. Motairek, Dr. Nasir, Mr. Chen, Dr. Kim, Dr. Khalifa, Dr. Freedman, Dr. Griggs, Dr. Rajagopalan, and Dr. Al-Kindi have no competing interests.

Figures

Figure 1
Classification and regression tree analysis (200 minimum counties at a terminal node) to predict county-level premature cardiovascular mortality (PCVM) using counties in the training set (N = 2008). Notes: Each path down to a terminal node represents a county phenotype. Box plots in the terminal nodes represent age-adjusted PCVM (per 100,000 people).
Figure 2
Characteristics of county premature cardiovascular mortality (PCVM) phenotypes identified by CART. Notes: Counties in training and test sets were both included. Maps were created by Python v3.10.6 (https://www.python.org/) and its libraries: geopandas (v0.11.1) and matplotlib (v3.5.3).
Figure 3
US County Maps of (A) age-adjusted premature cardiovascular mortality (per 100,000 people), and (B) county phenotypes of premature cardiovascular mortality. Note: maps were created by ArcGIS Pro v2.7.0 (https://pro.arcgis.com/).
Figure 4
Relative importance plot of risk factors in predicting county-level age-adjusted premature cardiovascular mortality from the random forest analysis. Notes: the most important variable is at the top and scaled to 100%. The importance of each remaining variable is shown relative to the top variable. Abbreviations: SNAP, Supplemental Nutrition Assistance Program; PM, fine particulate matter; RMP, risk management plan; NPL, National Priorities List.
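The scaling described in the Figure 4 notes (top variable set to 100%, the rest shown relative to it) can be sketched as follows. This is an illustrative sketch only: the function name `relative_importance`, the factor labels, and the raw scores are hypothetical placeholders and are not values from the study.

```python
def relative_importance(raw):
    """Scale raw importance scores so the largest equals 100, sorted descending."""
    top = max(raw.values())
    scaled = {name: 100.0 * score / top for name, score in raw.items()}
    return dict(sorted(scaled.items(), key=lambda kv: kv[1], reverse=True))


# Placeholder raw scores in arbitrary units (not results from the paper).
raw_scores = {"factor_a": 2, "factor_b": 5, "factor_c": 1}

scaled = relative_importance(raw_scores)
for name, pct in scaled.items():
    print(f"{name}: {pct:.0f}%")
```

Because only ratios to the top score are kept, the scaled values support ranking and comparison between variables but say nothing about absolute predictive contribution.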
