2025 Jan 6:9:100183.
doi: 10.1016/j.gloepi.2025.100183. eCollection 2025 Jun.

Modeling the determinants of attrition in a two-stage epilepsy prevalence survey in Nairobi using machine learning

Daniel M Mwanga et al. Glob Epidemiol. 2025.

Abstract

Background: Attrition is a challenge in parameter estimation in both longitudinal and multi-stage cross-sectional studies. Here, we examine the utility of machine learning to predict attrition and identify associated factors in a two-stage population-based epilepsy prevalence study in Nairobi.

Methods: All individuals in the Nairobi Urban Health and Demographic Surveillance System (NUHDSS) (Korogocho and Viwandani) were screened for epilepsy in two stages. Attrition was defined as probable epilepsy cases identified at stage-I who did not attend stage-II (neurologist assessment). Categorical variables were one-hot encoded, class imbalance was addressed using the synthetic minority over-sampling technique (SMOTE), and numeric variables were scaled and centered. The dataset was split into training and testing sets (7:3 ratio), and seven machine learning models, including the ensemble Super Learner, were trained. Hyperparameters were tuned using 10-fold cross-validation, and model performance was evaluated using area under the curve (AUC), accuracy, Brier score and F1 score over 500 bootstrap samples of the test data.
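The evaluation step described in the Methods (AUC, accuracy, Brier score and F1 over 500 bootstrap samples of the test data) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 classification threshold, the percentile interval and the toy labels and probabilities are all assumptions made for illustration, and the abstract does not say how the bootstrap results were summarized.

```python
import random

# Illustrative sketch only; names, threshold and toy data are assumptions.

def accuracy(y_true, y_prob, threshold=0.5):
    """Share of cases whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, y_prob):
    """Probability that a random positive outscores a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when a resample has one class only
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_metric(metric, y_true, y_prob, n_boot=500, seed=0):
    """Resample the test set with replacement and summarize one metric.

    Returns (mean, 2.5th percentile, 97.5th percentile) over the resamples
    in which the metric is defined.
    """
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = metric([y_true[i] for i in idx], [y_prob[i] for i in idx])
        if value == value:  # drop NaN from single-class resamples
            values.append(value)
    if not values:
        raise ValueError("metric undefined on every bootstrap resample")
    values.sort()
    mean = sum(values) / len(values)
    lower = values[int(0.025 * (len(values) - 1))]
    upper = values[int(0.975 * (len(values) - 1))]
    return mean, lower, upper

if __name__ == "__main__":
    # Toy test set: 1 = did not attend stage-II, 0 = attended (made-up values).
    y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
    y_prob = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3, 0.55, 0.45]
    for name, fn in [("AUC", auc), ("accuracy", accuracy),
                     ("Brier", brier_score), ("F1", f1_score)]:
        mean, lower, upper = bootstrap_metric(fn, y_true, y_prob)
        print(f"{name}: {mean:.3f} ({lower:.3f} to {upper:.3f})")
```

Resampling only the held-out test set, as here, describes the variability of the performance estimate without refitting any model; lower Brier scores indicate better-calibrated probabilities, whereas higher values are better for the other three metrics.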

Results: Random forest (AUC = 0.98, accuracy = 0.95, Brier score = 0.06, and F1 = 0.94), extreme gradient boost (XGB) (AUC = 0.96, accuracy = 0.91, Brier score = 0.08, F1 = 0.90) and support vector machine (SVM) (AUC = 0.93, accuracy = 0.93, Brier score = 0.07, F1 = 0.92) were the best performing models (base learners). Ensemble Super Learner had similarly high performance. Important predictors of attrition included proximity to industrial areas, male gender, employment, education, smaller households, and a history of complex partial seizures.

Conclusion: These findings can help researchers plan targeted mobilization for scheduled clinical appointments to improve follow-up rates. They will also inform the development of a web-based algorithm to predict attrition risk and support targeted follow-up efforts in similar studies.

Keywords: Attrition; Epilepsy; Loss to follow-up; Machine learning; Prevalence; Surveillance; Urban settlements.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Feature importance for prediction based on χ² scores.
Fig. 2. Visualization of model performance metrics: AUC, F1, accuracy and Brier scores.
Fig. 3. Feature rank comparison across models.
Fig. 4. Top 10 features based on importance ranking across the three best performing models.

