BMC Infect Dis. 2023 Jan 23;23(1):49.
doi: 10.1186/s12879-023-07987-6.

Spatial distribution and machine learning prediction of sexually transmitted infections and associated factors among sexually active men and women in Ethiopia, evidence from EDHS 2016


Abdul-Aziz Kebede Kassaw et al. BMC Infect Dis. 2023.

Abstract

Introduction: Sexually transmitted infections (STIs) are a major public health problem globally, affecting millions of people every day. The burden is high in Sub-Saharan Africa, including Ethiopia. However, there is little evidence on the distribution of STIs across Ethiopian regions. A better understanding of these infections is therefore important for reducing their burden on society. This article aimed to assess predictors of STIs using machine learning techniques and to describe their geographic distribution across Ethiopian regions. Assessing the predictors of STIs and their spatial distribution could help policymakers understand the problem better and design interventions accordingly.

Methods: A community-based cross-sectional study was conducted from January 18, 2016, to June 27, 2016, using the 2016 Ethiopian Demographic and Health Survey (EDHS) dataset. We applied spatial autocorrelation analysis using the Global Moran's I statistic to detect latent STI clusters. Spatial scan statistics based on the Bernoulli model were computed with SaTScan™ to identify significant local clusters. Supervised machine learning models (C5.0 Decision Tree, Random Forest, Support Vector Machine, Naïve Bayes, and Logistic Regression) were applied to the 2016 EDHS dataset for STI prediction, and their performance was compared. Association rules were mined using an unsupervised machine learning algorithm.
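
For readers unfamiliar with the Global Moran's I statistic named in the Methods, the following is a minimal sketch of how it is computed from a set of area-level values and a spatial weights matrix. It is not the authors' code or workflow. The function name `global_morans_i`, the toy values, and the chain-shaped neighbour matrix are assumptions made purely for illustration.

```python
def global_morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]              # deviations from the mean
    s0 = sum(sum(row) for row in weights)         # total weight S0
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )                                             # weighted cross-products
    ss = sum(d * d for d in dev)                  # sum of squared deviations
    return (n / s0) * (cross / ss)


# Hypothetical example: four areas in a row, each a neighbour of the next.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Similar values sit next to each other: positive autocorrelation.
print(global_morans_i([1, 2, 3, 4], chain_weights))   # about 0.333

# High and low values alternate: negative autocorrelation.
print(global_morans_i([1, 0, 1, 0], chain_weights))   # -1.0
```

Values of I above zero indicate that similar values tend to be neighbours (clustering), values below zero indicate dispersion, and values near zero are consistent with spatial randomness. The p value that accompanies the index comes from a separate significance test, which this sketch does not include.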

Results: STIs in Ethiopia were spatially clustered across the country (global Moran's I = 0.06, p value = 0.04). The Random Forest algorithm performed best for STI prediction, with 69.48% balanced accuracy and 68.50% area under the curve. The Random Forest model identified region, wealth, age category, educational level, age at first sex, working status, marital status, media access, alcohol drinking, chat (khat) chewing, and sex of the respondent as the top 11 predictors of STIs in Ethiopia.
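
The two performance measures reported in the Results, balanced accuracy and area under the curve (AUC), can be illustrated with a small sketch. It uses made-up labels and scores rather than EDHS data, and the helper names `balanced_accuracy` and `auc` are hypothetical. AUC is computed here with the pairwise (Mann-Whitney) definition, which may differ from the procedure the authors used.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    return 0.5 * (sensitivity + specificity)


def auc(y_true, scores):
    """Probability that a random case is scored above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up, imbalanced toy data: 2 cases and 4 non-cases.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]                 # hard class predictions
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]     # predicted probabilities

print(balanced_accuracy(y_true, y_pred))    # 0.625
print(auc(y_true, scores))                  # 0.875
```

Balanced accuracy is often preferred over plain accuracy when the outcome is rare, because a model that predicts "no STI" for everyone would score well on plain accuracy but only 50% on balanced accuracy.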

Conclusion: The random forest machine learning algorithm is the proposed model for predicting STIs in Ethiopia and identifying their predictors.

Keywords: Ethiopia; Machine learning; Prediction; Sexually transmitted infections; Spatial distribution.


Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. General information about methodology in machine learning for STI prediction
Fig. 2. Hotspot analysis of STI among sexually active men and women in Ethiopia, EDHS 2016
Fig. 3. SaTScan analysis of STI using the Kulldorff SaTScan approach among sexually active men and women in Ethiopia, EDHS 2016
Fig. 4. Kriging interpolation of STI among sexually active men and women in Ethiopia, EDHS 2016
Fig. 5. Important variable selection for STI prediction using Boruta algorithm, EDHS 2016. iSTI: information about STI; nosp: number of sexual partners; Ch: chat chewing; Al: alcohol drinking; agc: age group; ms: marital status; AFS: age at first sex; ws: working status; Ma: media access; Rg: region
Fig. 6. Performance measurement using AUC on all balanced sampling techniques
Fig. 7. Variable importance measures of STI determinants in random forest algorithm, evidence from EDHS 2016. Ch: chat chewing; Al: alcohol drinking; agc: age group; ms: marital status; AFS: age at first sex; ws: working status; Ma: media access; Rg: region
