Can Machine Learning Help Identify Patients at Risk for Recurrent Sexually Transmitted Infections?
- PMID: 32810028
- PMCID: PMC10949112
- DOI: 10.1097/OLQ.0000000000001264
Abstract
Background: A substantial fraction of sexually transmitted infections (STIs) occur in patients who have previously been treated for an STI. We assessed whether routine electronic health record (EHR) data can predict which patients presenting with an incident STI are at greatest risk for additional STIs in the next 1 to 2 years.
Methods: We used structured EHR data on patients 15 years or older who acquired an incident STI diagnosis in 2008 to 2015 in eastern Massachusetts. We applied machine learning algorithms to model the risk of acquiring ≥1 or ≥2 additional STI diagnoses within 365 or 730 days after the initial diagnosis, using more than 180 different EHR variables. We performed a sensitivity analysis incorporating state health department surveillance data to assess whether identifying STI cases more accurately improved algorithm performance.
Results: We identified 8723 incident episodes of laboratory-confirmed gonorrhea, chlamydia, or syphilis. Bayesian Additive Regression Trees, the best-performing single algorithm, had a cross-validated area under the receiver operating characteristic curve (AUC) of 0.75. The curves for this algorithm showed a poor balance between sensitivity and positive predictive value (PPV). A predictive probability threshold with a sensitivity of 91.5% had a corresponding PPV of 3.9%. A higher threshold with a PPV of 29.5% had a sensitivity of 11.7%. Attempting to improve the classification of patients with and without repeat STI diagnoses by incorporating health department surveillance data had minimal impact on the cross-validated AUC.
Conclusions: Machine learning algorithms using structured EHR data did not differentiate well between patients with and without repeat STI diagnoses. Alternative strategies that can account for sociobehavioral characteristics could be explored.
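The trade-off between sensitivity and PPV described in the Results can be illustrated with a minimal sketch. It is not the authors' code: the function name `sensitivity_and_ppv`, the scores, and the labels are all made up for illustration, and the toy data are far smaller and far less imbalanced than the study cohort. The sketch only shows the mechanics: lowering the probability threshold flags more patients, which raises sensitivity but usually lowers PPV, and raising it does the reverse.

```python
from typing import Sequence, Tuple


def sensitivity_and_ppv(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, PPV) when patients with score >= threshold are flagged.

    labels: 1 = repeat STI diagnosis observed, 0 = none (hypothetical coding).
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Sensitivity: share of true repeat cases that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # PPV: share of flagged patients who were true repeat cases.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Made-up predicted probabilities and outcomes, for illustration only.
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 0, 0, 1, 0, 1]

low = sensitivity_and_ppv(scores, labels, threshold=0.20)   # flags 7 patients
high = sensitivity_and_ppv(scores, labels, threshold=0.85)  # flags 1 patient

print(f"threshold 0.20: sensitivity={low[0]:.2f}, PPV={low[1]:.2f}")
print(f"threshold 0.85: sensitivity={high[0]:.2f}, PPV={high[1]:.2f}")
```

In this toy data the low threshold catches every repeat case but most flagged patients are false positives, while the high threshold flags only a true case and misses the other two. The study reports the same qualitative pattern at much lower PPV values.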
Conflict of interest statement
Conflict of Interest and Sources of Funding:
None declared.