Lancet Digit Health. 2023 Apr;5(4):e185-e193.
doi: 10.1016/S2589-7500(22)00255-2.

Development and validation of a diagnostic aid for convulsive epilepsy in sub-Saharan Africa: a retrospective case-control study

Gabriel Davis Jones et al. Lancet Digit Health. 2023 Apr.

Abstract

Background: Identification of convulsive epilepsy in sub-Saharan Africa relies on access to resources that are often unavailable. Infrastructure and resource requirements can further complicate case verification. Using machine-learning techniques, we have developed and tested a region-specific questionnaire panel and predictive model to identify people who have had a convulsive seizure. These findings have been implemented into a free app for health-care workers in Kenya, Uganda, Ghana, Tanzania, and South Africa.

Methods: In this retrospective case-control study, we used data from the Studies of the Epidemiology of Epilepsy in Demographic Sites in Kenya, Uganda, Ghana, Tanzania, and South Africa. We randomly split these individuals in a 7:3 ratio into a training dataset and a validation dataset. We used information gain and correlation-based feature selection to identify eight binary features to predict convulsive seizures. We then assessed several machine-learning algorithms to create a multivariate prediction model. We validated the best-performing model with the internal dataset and a prospectively collected external-validation dataset. We additionally evaluated a leave-one-site-out (LOSO) model, in which the model was trained on data from all sites except one, with each site in turn forming the validation dataset. We used these features to develop a questionnaire-based predictive panel that we implemented into a multilingual app (the Epilepsy Diagnostic Companion) for health-care workers in each geographical region.
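The information-gain step can be illustrated with a minimal sketch. It assumes binary (0/1) features and a binary outcome; the function names and toy data are hypothetical and are not taken from the study's code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature, labels):
    """Reduction in outcome entropy after splitting on a binary (0/1) feature."""
    total = len(labels)
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(feature, labels) if x == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain


# Toy data (hypothetical): 1 = convulsive seizure, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0]    # separates cases from controls
uninformative = [1, 0, 1, 0, 1, 0, 1, 0]  # unrelated to the outcome

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
print(gain_informative, gain_uninformative)
```

A feature with gain close to zero, like the second toy feature, is the kind of candidate that would be dropped for negligible information gain.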

Findings: We analysed epilepsy-specific data from 4097 people, of whom 1985 (48·5%) had convulsive epilepsy, and 2112 were controls. From 170 clinical variables, we initially identified 20 candidate predictor features. Eight features were removed, six because of negligible information gain and two following review by a panel of qualified neurologists. Correlation-based feature selection identified eight variables that demonstrated predictive value; all but one were associated with an increased risk of an epileptic convulsion. The logistic regression, support vector, and naive Bayes models performed similarly, outperforming the decision-tree model. We chose the logistic regression model for its interpretability and implementability. The area under the receiver operating characteristic curve (AUC) was 0·92 (95% CI 0·91-0·94, sensitivity 85·0%, specificity 93·7%) in the internal-validation dataset and 0·95 (0·92-0·98, sensitivity 97·5%, specificity 82·4%) in the external-validation dataset. Similar results were observed for the LOSO model (AUC 0·94, 0·93-0·96, sensitivity 88·2%, specificity 95·3%).
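For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC can be computed from predicted scores. The data, the 0·5 threshold, and the function names are illustrative assumptions, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than with whatever software the authors used.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Probability that a random case outscores a random control (ties count half)."""
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): 1 = convulsive epilepsy, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(sens, spec, toy_auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds, which is why the abstract reports all three.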

Interpretation: On the basis of these findings, we developed the Epilepsy Diagnostic Companion as a predictive model and app offering a validated culture-specific and region-specific solution to confirm the diagnosis of a convulsive epileptic seizure in people with suspected epilepsy. The questionnaire panel is simple and accessible for health-care workers without specialist knowledge to administer. This tool can be iteratively updated and could lead to earlier, more accurate diagnosis of seizures and improve care for people with epilepsy.
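As a rough illustration of how a questionnaire panel of eight yes/no items can feed a logistic regression model, consider the sketch below. The question keys, coefficients, and intercept are entirely hypothetical placeholders; they are not the published model and must not be used clinically.

```python
import math

# Hypothetical coefficients for illustration only; NOT the published model.
# One weight is negative because the abstract notes that one of the eight
# features was not associated with increased risk.
INTERCEPT = -2.0
WEIGHTS = {
    "q1": 1.5, "q2": 1.2, "q3": 1.0, "q4": 0.9,
    "q5": 0.8, "q6": 0.7, "q7": 0.6, "q8": -0.8,
}


def seizure_probability(answers):
    """Map yes/no answers (dict of question key -> bool) to a probability."""
    z = INTERCEPT + sum(
        weight for key, weight in WEIGHTS.items() if answers.get(key, False)
    )
    return 1.0 / (1.0 + math.exp(-z))


all_no = {key: False for key in WEIGHTS}
risk_yes = {key: WEIGHTS[key] > 0 for key in WEIGHTS}  # "yes" to risk-raising items

p_all_no = seizure_probability(all_no)
p_risk_yes = seizure_probability(risk_yes)
print(p_all_no, p_risk_yes)
```

Because each answer contributes a fixed additive term to the log-odds, a model of this form is easy to inspect and to implement in an app, consistent with the stated reasons for choosing logistic regression.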

Funding: The Wellcome Trust, the UK National Institute of Health Research, and the Oxford NIHR Biomedical Research Centre.


Conflict of interest statement

Declaration of interests: We declare no competing interests.

Figures

Figure 1. Workflow for data processing and model development
The SEEDs dataset from five unique regions in sub-Saharan Africa (Ghana, Kenya, South Africa, Tanzania, and Uganda) was split into two subsets balanced for prevalence of convulsive epilepsy, sex, and geographical site, with n=2875 for model training and n=1222 for validation. The best-performing model was then selected and evaluated using the prospectively collected external-validation dataset from Kilifi, Kenya.
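A balanced 7:3 split of this kind can be sketched as a stratified random split. The example below is a minimal illustration under assumed field names (`site`, `sex`, `case`) and synthetic records; it is not the authors' procedure.

```python
import random
from collections import defaultdict


def stratified_split(records, keys, train_fraction=0.7, seed=0):
    """Split records into train/validation, preserving proportions per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in keys)].append(rec)
    train, validation = [], []
    for group in strata.values():
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        validation.extend(group[cut:])
    return train, validation


# Synthetic records (hypothetical): 10 people per site/sex/outcome combination.
sites = ["Ghana", "Kenya", "South Africa", "Tanzania", "Uganda"]
records = [
    {"id": f"{site}-{sex}-{case}-{i}", "site": site, "sex": sex, "case": case}
    for site in sites
    for sex in ("F", "M")
    for case in (0, 1)
    for i in range(10)
]

train, validation = stratified_split(records, keys=("site", "sex", "case"))
print(len(train), len(validation))
```

Splitting within each stratum, rather than across the pooled data, is what keeps outcome prevalence, sex, and site proportions matched between the two subsets.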
Figure 2. Identification of optimal number of predictive features for a prediction of convulsive epilepsy
Model performance improves as features are added, up to a plateau at around eight features (red arrow). The model AUC does not improve significantly beyond eight features (rolling mean p=0·77), even when up to 18 features are added (AUC 0·93 with eight features and 0·93 with 18 features). This demonstrates that adding features does not necessarily improve model performance. Blue indicates 95% CI. AUC=area under the receiver operator curve.
Figure 3. Comparison of model AUC with different machine learning algorithms
Central line indicates mean AUC, lower edge of the box indicates first quartile, upper edge of the box indicates third quartile, lower whisker indicates minimum AUC, and upper whisker indicates maximum AUC. Decision tree, logistic regression, naive Bayes assuming a Bernoulli distribution, and a support-vector machine using a linear kernel are shown. The logistic regression, naive Bayes, and support vector models showed similar results (p=0·67), and outperformed the decision-tree model (p<0·001). AUC=receiver operator characteristic area under the curve.
Figure 4. AUC for the logistic regression model trained to predict convulsive epilepsy
The internal-validation dataset AUC was 0·92 (95% CI 0·91–0·94), the LOSO AUC was 0·94 (0·93–0·96), and the external-validation dataset AUC was 0·96 (0·93–0·98). AUC=receiver operator characteristic area under the curve. LOSO=leave one site out.
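The leave-one-site-out scheme can be sketched as a generator that holds out each site in turn. The record layout and the `site` key are assumptions made for illustration; model fitting is omitted.

```python
def leave_one_site_out(records, site_key="site"):
    """Yield (held_out_site, train, validation) with each site held out once."""
    sites = sorted({rec[site_key] for rec in records})
    for held_out in sites:
        train = [rec for rec in records if rec[site_key] != held_out]
        validation = [rec for rec in records if rec[site_key] == held_out]
        yield held_out, train, validation


# Synthetic records (hypothetical): four people per site.
loso_records = [
    {"site": site, "person": i}
    for site in ("Ghana", "Kenya", "South Africa", "Tanzania", "Uganda")
    for i in range(4)
]

folds = list(leave_one_site_out(loso_records))
for held_out, train_fold, validation_fold in folds:
    # A model would be fitted on train_fold and scored on validation_fold here.
    print(held_out, len(train_fold), len(validation_fold))
```

Because the held-out site contributes nothing to training, performance in each fold gives an indication of how well the model transfers to a region it has not seen.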

