Angiology. 2025 Oct;76(9):864-875.
doi: 10.1177/00033197241244814. Epub 2024 Apr 3.

County-Level Socio-Environmental Factors Associated With Stroke Mortality in the United States: A Cross-Sectional Study

Pedro R V O Salerno et al. Angiology. 2025 Oct.

Abstract

We used machine learning methods to explore sociodemographic and environmental determinants of health (SEDH) associated with county-level stroke mortality in the USA. We conducted a cross-sectional analysis of individuals aged ≥15 years who died from any stroke subtype between 2016 and 2020. We analyzed 54 county-level SEDH possibly associated with age-adjusted stroke mortality rates per 100,000 people. Classification and Regression Tree (CART) analysis was used to identify specific county-level clusters associated with stroke mortality. Variable importance was assessed using Random Forest analysis. A total of 501,391 decedents from 2397 counties were included. CART identified 10 clusters, with a 77.5% relative increase in stroke mortality rates across the spectrum (28.5 vs 50.7 per 100,000 persons). CART used 8 SEDH to guide the classification of the county clusters: annual Median Household Income ($), live births with Low Birthweight (%), current adult Smokers (%), adults reporting Severe Housing Problems (%), adequate Access to Exercise (%), adults reporting Physical Inactivity (%), adults with diagnosed Diabetes (%), and adults reporting Excessive Drinking (%). In conclusion, SEDH exposures have a complex relationship with stroke. Machine learning approaches can help deconstruct this relationship and demonstrate associations that allow improved understanding of the socio-environmental drivers of stroke and the development of targeted interventions.
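The clustering step described in the abstract can be illustrated with a minimal regression-tree sketch. This is not the authors' code: the data are synthetic, the field names (`income`, `smokers`, `rate`) and helper functions are hypothetical, and only two predictors are used instead of 54. It assumes a squared-error split criterion and a minimum terminal-node size of 150 counties, then labels the resulting clusters alphabetically from lowest to highest median rate.

```python
import random
from statistics import median

def sse(values):
    """Sum of squared errors around the mean (regression-tree impurity)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target, features, min_leaf):
    """Return (feature, threshold) that most reduces SSE, or None."""
    best, best_score = None, sse([r[target] for r in rows])
    for f in features:
        ordered = sorted(rows, key=lambda r: r[f])
        # Only consider cut points that leave >= min_leaf rows on each side.
        for i in range(min_leaf, len(ordered) - min_leaf + 1):
            if ordered[i - 1][f] == ordered[i][f]:
                continue  # cannot cut between identical values
            score = (sse([r[target] for r in ordered[:i]])
                     + sse([r[target] for r in ordered[i:]]))
            if score < best_score:
                best_score, best = score, (f, ordered[i - 1][f])
    return best

def grow(rows, target, features, min_leaf, max_depth):
    """Recursively split; return the terminal nodes as lists of rows."""
    split = best_split(rows, target, features, min_leaf) if max_depth > 0 else None
    if split is None:
        return [rows]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (grow(left, target, features, min_leaf, max_depth - 1)
            + grow(right, target, features, min_leaf, max_depth - 1))

def relative_increase(low, high):
    """Percent increase from low to high."""
    return 100.0 * (high - low) / low

# Synthetic "counties" (invented numbers, for illustration only).
rng = random.Random(0)
counties = []
for _ in range(600):
    income = rng.uniform(30_000, 90_000)
    smokers = rng.uniform(10, 30)
    rate = 60 - income / 3000 + 0.5 * smokers + rng.gauss(0, 2)
    counties.append({"income": income, "smokers": smokers, "rate": rate})

leaves = grow(counties, "rate", ["income", "smokers"], min_leaf=150, max_depth=3)
leaves.sort(key=lambda leaf: median(r["rate"] for r in leaf))
medians = [median(r["rate"] for r in leaf) for leaf in leaves]
for label, leaf, med in zip("ABCDEFGHIJ", leaves, medians):
    print(f"Cluster {label}: {len(leaf)} counties, median rate {med:.1f}")

# The same helper applied to the rounded rates quoted in the abstract.
print(f"{relative_increase(28.5, 50.7):.1f}%")
```

Applied to the rounded rates quoted in the abstract (28.5 and 50.7 per 100,000), `relative_increase` gives about 77.9%; the reported 77.5% presumably reflects unrounded cluster rates, which is an assumption rather than something the abstract states.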

Keywords: epidemiology; health policy; machine learning; sociodemographic and environmental determinants of health; stroke.

Conflict of interest statement

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
Classification and regression tree (CART) analysis to predict county-level age-adjusted stroke mortality rate. Notes: Each path down to a terminal node represents a county SEDH cluster. Box plots in the terminal nodes represent the median age-adjusted stroke mortality (per 100,000 people). The minimum number of counties in a terminal node was set to 150. Clusters were labeled alphabetically based on median age-adjusted all-stroke mortality rate, from lowest to highest.
Figure 2.
United States County Maps of (A) age-adjusted stroke mortality (per 100,000 people), divided by natural breaks into 10 groups. (B) County clusters, A-J, from smallest to largest age-adjusted stroke mortality.
Figure 3.
Dot chart of random forest analysis showing variable importance for predicting county-level age-adjusted stroke mortality. Notes: The most important variable is at the top and scaled to 100%. The importance of the remaining variables is shown relative to the top one. Full descriptions of the variables are available in Table 1. Abbreviations: RMP, Risk Management Plan; NPL, National Priorities List.
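The scaling described in the Figure 3 notes (top variable set to 100%, the rest expressed relative to it) can be sketched as follows. The function name, variable names, and raw importance scores are hypothetical placeholders, not values from the study, and the sketch assumes the raw scores are non-negative numbers such as those a random forest would produce.

```python
def scale_to_top(importances):
    """Rank variables by raw importance and rescale so the top one is 100%."""
    top = max(importances.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, 100.0 * value / top) for name, value in ranked]

# Invented raw scores for three placeholder variables.
raw_scores = {"variable_a": 0.12, "variable_b": 0.30, "variable_c": 0.06}

for name, pct in scale_to_top(raw_scores):
    print(f"{name:<12} {pct:6.1f}%")
```

Because every score is divided by the same maximum, the ranking is unchanged and only the axis is rescaled, which is what makes the dot chart readable across variables with different raw magnitudes.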
