Sex Transm Dis. 2021 Jan;48(1):56-62.
doi: 10.1097/OLQ.0000000000001264.

Can Machine Learning Help Identify Patients at Risk for Recurrent Sexually Transmitted Infections?

Heather R Elder et al. Sex Transm Dis. 2021 Jan.

Abstract

Background: A substantial fraction of sexually transmitted infections (STIs) occur in patients who have previously been treated for an STI. We assessed whether routine electronic health record (EHR) data can predict which patients presenting with an incident STI are at greatest risk for additional STIs in the next 1 to 2 years.

Methods: We used structured EHR data on patients 15 years or older who received an incident STI diagnosis between 2008 and 2015 in eastern Massachusetts. We applied machine learning algorithms to model the risk of acquiring ≥1 or ≥2 additional STI diagnoses within 365 or 730 days after the initial diagnosis, using more than 180 different EHR variables. We performed a sensitivity analysis incorporating state health department surveillance data to assess whether improving the accuracy of identifying STI cases improved algorithm performance.
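The four outcome definitions described in the Methods (≥1 or ≥2 additional diagnoses within 365 or 730 days) can be illustrated with a minimal sketch. The function name `label_repeat_sti`, the example dates, and the choice to count diagnoses strictly after the index date and up to and including the last day of the window are all assumptions made for illustration; the abstract does not specify how boundaries or overlapping episodes were handled.

```python
from datetime import date, timedelta

def label_repeat_sti(index_date, later_dates, window_days, min_repeats):
    """Return True if at least `min_repeats` additional diagnoses fall
    within `window_days` after the index diagnosis.

    Assumption (not from the source): a diagnosis counts if it is strictly
    after the index date and on or before index_date + window_days.
    """
    cutoff = index_date + timedelta(days=window_days)
    n_repeats = sum(1 for d in later_dates if index_date < d <= cutoff)
    return n_repeats >= min_repeats

# Hypothetical patient: index diagnosis plus two later diagnoses.
index = date(2010, 1, 1)
later = [date(2010, 6, 1), date(2011, 3, 1)]

# One label per (window, minimum repeats) combination named in the Methods.
labels = {
    (window, k): label_repeat_sti(index, later, window, k)
    for window in (365, 730)
    for k in (1, 2)
}
```

For this hypothetical patient, only the first repeat diagnosis falls inside the 365-day window, so the ≥2 outcome is positive only under the 730-day definition.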

Results: We identified 8723 incident episodes of laboratory-confirmed gonorrhea, chlamydia, or syphilis. Bayesian Additive Regression Trees, the best-performing single algorithm, had a cross-validated area under the receiver operating characteristic curve of 0.75. Receiver operating characteristic curves for this algorithm showed a poor balance between sensitivity and positive predictive value (PPV). A predictive probability threshold with a sensitivity of 91.5% had a corresponding PPV of 3.9%. A higher threshold with a PPV of 29.5% had a sensitivity of 11.7%. Attempting to improve the classification of patients with and without repeat STI diagnoses by incorporating health department surveillance data had minimal impact on the cross-validated area under the receiver operating characteristic curve.
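The trade-off between sensitivity and PPV reported above arises from moving the probability threshold used to flag patients. The sketch below shows how the two quantities are computed at a given threshold. The scores and labels are invented toy values, not study data, and the function name `sensitivity_and_ppv` is hypothetical.

```python
def sensitivity_and_ppv(scores, labels, threshold):
    """Compute sensitivity and PPV when flagging every score >= threshold.

    sensitivity = TP / (TP + FN): share of true repeat cases that are flagged.
    PPV         = TP / (TP + FP): share of flagged patients who are true cases.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv

# Toy data (assumption, for illustration only): 10 patients, 2 true repeat cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

low_threshold = sensitivity_and_ppv(scores, labels, 0.5)    # flags 5 patients
high_threshold = sensitivity_and_ppv(scores, labels, 0.85)  # flags 1 patient
```

In this toy example, the lower threshold catches both true cases but flags three patients without a repeat diagnosis, while the higher threshold flags only a true case but misses the other one. The same mechanism, at a much lower outcome prevalence, underlies the pattern of thresholds reported in the Results.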

Conclusions: Machine learning algorithms using structured EHR data did not differentiate well between patients with and without repeat STI diagnoses. Alternative strategies that can account for sociobehavioral characteristics could be explored.

Conflict of interest statement

Conflict of Interest and Sources of Funding: None declared.

Figures

Figure 1.
Distribution of predicted risk scores from the BART* regression model for ≥2 repeat diagnoses of sexually transmitted infections (STIs) within 730 days, Atrius Health, 2008 to 2015. A, Risk score distribution for patients with exactly 2 repeat STIs. B, Risk score distribution for patients with exactly 3 repeat STIs. C, Risk score distribution for patients with exactly 4 repeat STIs. D, Risk score distribution for patients with ≥5 repeat STIs. *BART indicates Bayesian Additive Regression Trees. An STI was defined as a positive laboratory result for chlamydia or gonorrhea, or a syphilis diagnosis.
