PLoS One. 2023 Nov 22;18(11):e0288903.
doi: 10.1371/journal.pone.0288903. eCollection 2023.

Use of feature importance statistics to accurately predict asthma attacks using machine learning: A cross-sectional cohort study of the US population

Alexander A Huang et al. PLoS One.

Abstract

Background: Asthma attacks are a major cause of morbidity and mortality in vulnerable populations, and identification of associations with asthma attacks is necessary to improve public awareness and the timely delivery of medical interventions.

Objective: This study aimed to quantify the feature importance of factors associated with asthma attacks in a nationally representative population of US adults.

Methods: A cross-sectional analysis was conducted using a modern, nationally representative cohort, the National Health and Nutrition Examination Surveys (NHANES 2017-2020). All adults older than 18 years of age with information on asthma attacks (7,922 individuals in total) were included in the study. Univariable regression was used to identify significant nutritional covariates for inclusion in a machine learning model, and feature importance was reported. The acquisition and analysis of the data were authorized by the National Center for Health Statistics Ethics Review Board.
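The univariable screening step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the regression is logistic (the outcome is binary), uses a Wald test for the slope, and runs on synthetic data. The function names `univariable_logit_pvalue` and `screen_features`, the feature names, and the simulated effect sizes are all hypothetical.

```python
import math
import numpy as np

def univariable_logit_pvalue(x, y, n_iter=25):
    """Two-sided Wald p-value for the slope of a one-covariate logistic regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardizing x keeps the Newton steps stable; the slope's z-score is unchanged.
    X = np.column_stack([np.ones_like(x), (x - x.mean()) / x.std()])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    # Recompute the information matrix at the final estimate.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])
    std_err = math.sqrt(np.linalg.inv(hessian)[1, 1])
    z = beta[1] / std_err
    return math.erfc(abs(z) / math.sqrt(2.0))

def screen_features(features, y, alpha=1e-4):
    """Return the names of columns whose univariable p-value is below alpha."""
    return [name for name, col in features.items()
            if univariable_logit_pvalue(col, y) < alpha]

# Synthetic illustration only (not NHANES data): one informative and one noise covariate.
rng = np.random.default_rng(0)
n = 2000
signal = rng.normal(size=n)
noise = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(-2.0 + 1.0 * signal)))
y = (rng.random(n) < prob).astype(float)

selected = screen_features({"signal": signal, "noise": noise}, y)
print(selected)
```

Only the covariates that pass the threshold would then be passed to the downstream model. The default `alpha=1e-4` mirrors the P<0.0001 cutoff reported in the Results.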

Results: A total of 7,922 patients met the inclusion criteria. Of 680 candidate features, 55 were significant on univariable analysis (threshold P<0.0001) and were included in the machine learning model. The XGBoost model had an area under the receiver operating characteristic curve (AUROC) of 0.737, a sensitivity of 0.960, and a negative predictive value (NPV) of 0.967. The top five features ranked by gain, a measure of the percentage contribution of the covariate to the overall model prediction, were Octanoic Acid intake as a Saturated Fatty Acid (SFA) (gm) (Gain = 8.8%), Eosinophil percent (Gain = 7.9%), BMXHIP-Hip Circumference (cm) (Gain = 7.2%), BMXHT-standing height (cm) (Gain = 6.2%), and HS C-Reactive Protein (mg/L) (Gain = 6.1%).
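For readers less familiar with the reported statistics, the sketch below shows how AUROC, sensitivity, and NPV are computed from true labels and predicted scores. The labels, scores, and the 0.5 threshold are made-up illustrative values, and the helper names are hypothetical; the abstract does not state which classification threshold the authors used.

```python
import numpy as np

def auroc(y_true, scores):
    """Probability that a random positive scores higher than a random negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def sensitivity_and_npv(y_true, scores, threshold):
    """Sensitivity = TP / (TP + FN); NPV = TN / (TN + FN)."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(scores, dtype=float) >= threshold).astype(int)
    tp = int(((pred == 1) & (y_true == 1)).sum())
    fn = int(((pred == 0) & (y_true == 1)).sum())
    tn = int(((pred == 0) & (y_true == 0)).sum())
    return tp / (tp + fn), tn / (tn + fn)

# Toy data: 1 = asthma attack in the past year, 0 = no attack.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.1, 0.2, 0.8, 0.6, 0.05]

area = auroc(y_true, scores)
sens, npv = sensitivity_and_npv(y_true, scores, threshold=0.5)
print(round(area, 3), round(sens, 3), round(npv, 3))  # 0.933 0.667 0.8
```

Sensitivity and NPV both depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds. A model can therefore pair a moderate AUROC with high sensitivity and NPV when the threshold is set low.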

Conclusion: Machine learning models can offer feature importance and other statistics that help identify associations with asthma attacks.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Receiver operating characteristic curve and model statistics.
Receiver operating characteristic curve for the machine learning model predicting whether the patient had an asthma attack within the past year. AUROC = 0.737 (P<0.001).

Fig 2. Overall SHAP explanations.
SHAP explanations. Purple represents higher values of the covariate and yellow represents lower values. The x-axis shows the change in log-odds of reporting an asthma attack within the past year.

Fig 3. SHAP explanations for the Top 4 continuous covariates sorted by overall SHAP explanations.
SHAP explanations with the covariate value on the x-axis and the change in log-odds on the y-axis. The red line represents the relationship between the covariate and the log-odds of an asthma attack; each black dot represents an observation. Covariates: top left, Told doctor they had a sleeping disorder; top right, Eosinophil percent (%); bottom left, MCQ520, Abdominal pain during past 12 months; bottom right, Octanoic Acid intake as a Saturated Fatty Acid (SFA) (gm).
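The figure captions report SHAP values as changes in log-odds. The sketch below shows how such contributions combine into a predicted probability, assuming the usual additive SHAP property (baseline log-odds plus the sum of per-feature contributions equals the model's log-odds output). The 5% baseline, the feature names, and the contribution values are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def predicted_probability(base_log_odds, shap_values):
    """Add per-feature log-odds contributions to the baseline, then map to a probability."""
    return sigmoid(base_log_odds + sum(shap_values.values()))

# Hypothetical baseline: a 5% average predicted risk of an asthma attack.
base_probability = 0.05
base_log_odds = math.log(base_probability / (1.0 - base_probability))

# Hypothetical per-feature SHAP values (log-odds scale) for one individual.
contributions = {
    "eosinophil_percent": 0.40,
    "octanoic_acid_gm": -0.10,
    "sleep_disorder": 0.30,
}

risk = predicted_probability(base_log_odds, contributions)
print(f"baseline risk: {base_probability:.3f}, individual risk: {risk:.3f}")
```

Because the contributions add on the log-odds scale, a total shift of +0.6 multiplies the odds by exp(0.6), roughly 1.8. The same shift therefore changes the probability by different amounts depending on the baseline.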
