J Med Internet Res. 2022 Aug 25;24(8):e37850.
doi: 10.2196/37850.

Web-Based Risk Prediction Tool for an Individual's Risk of HIV and Sexually Transmitted Infections Using Machine Learning Algorithms: Development and External Validation Study

Xianglong Xu et al. J Med Internet Res.

Abstract

Background: HIV and sexually transmitted infections (STIs) are major global public health concerns. Over 1 million curable STIs occur every day among people aged 15 years to 49 years worldwide. Insufficient testing or screening substantially impedes the elimination of HIV and STI transmission.

Objective: The aim of our study was to develop an HIV and STI risk prediction tool using machine learning algorithms.

Methods: We used clinic consultations that tested for HIV and STIs at the Melbourne Sexual Health Centre between March 2, 2015, and December 31, 2018, as the development data set (training and testing data set). We also used 2 external validation data sets, including data from 2019 as external "validation data 1" and data from January 2020 and January 2021 as external "validation data 2." We developed 34 machine learning models to assess the risk of acquiring HIV, syphilis, gonorrhea, and chlamydia. We created an online tool to generate an individual's risk of HIV or an STI.
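The temporal split described in the Methods can be sketched as a date-based assignment. This is only an illustrative sketch: the function name `assign_dataset` and the idea of labeling each consultation by its date are hypothetical, and the window for "validation data 2" is assumed here to run from January 1, 2020, through January 31, 2021, which the abstract does not state explicitly.

```python
from datetime import date

# Development period as stated in the abstract.
DEV_START = date(2015, 3, 2)
DEV_END = date(2018, 12, 31)

# ASSUMPTION: "validation data 2" is treated as one continuous window.
VAL2_START = date(2020, 1, 1)
VAL2_END = date(2021, 1, 31)


def assign_dataset(consult_date):
    """Label a consultation by the period its date falls in (hypothetical helper)."""
    if DEV_START <= consult_date <= DEV_END:
        return "development"   # later divided into training and testing data
    if consult_date.year == 2019:
        return "validation_1"  # external validation data 1
    if VAL2_START <= consult_date <= VAL2_END:
        return "validation_2"  # external validation data 2 (assumed window)
    return "excluded"
```

Splitting by calendar period rather than at random means the external validation sets contain only consultations that occurred after the development period, so they test whether performance holds over time.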

Results: The important predictors for HIV and STI risk were gender, age, men who reported having sex with men, number of casual sexual partners, and condom use. Our machine learning-based risk prediction tool, named MySTIRisk, performed at an acceptable or excellent level on testing data sets (area under the curve [AUC] for HIV=0.78; AUC for syphilis=0.84; AUC for gonorrhea=0.78; AUC for chlamydia=0.70) and had stable performance on both external validation data from 2019 (AUC for HIV=0.79; AUC for syphilis=0.85; AUC for gonorrhea=0.81; AUC for chlamydia=0.69) and data from 2020-2021 (AUC for HIV=0.71; AUC for syphilis=0.84; AUC for gonorrhea=0.79; AUC for chlamydia=0.69).
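For readers unfamiliar with the AUC values reported above, the metric can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. It is a generic illustration, not the authors' evaluation code, and the labels and scores are made-up toy values rather than study data.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect ranking.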

Conclusions: Our web-based risk prediction tool could accurately predict the risk of HIV and STIs for clinic attendees using simple self-reported questions. MySTIRisk could serve as an HIV and STI screening tool on clinic websites or digital health platforms to encourage individuals at risk of HIV or an STI to be tested or start HIV pre-exposure prophylaxis. The public can use this tool to assess their risk and then decide whether to attend a clinic for testing. Clinicians or public health workers can use this tool to identify high-risk individuals for further interventions.

Keywords: HIV; algorithm; chlamydia; development; gonorrhea; machine learning; model; prediction; predictive; risk; risk assessment; sexual health; sexual transmission; sexually transmitted; sexually transmitted infections; syphilis; validation; web-based.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Development of machine learning algorithms. The architecture of the gradient boosting machine was adapted from Feng et al [35]. LASSO: least absolute shrinkage and selection operator.
Figure 2. Importance of the top 10 predictors in the prediction of HIV or sexually transmitted infections (STIs) using a gradient boosting machine, for detecting (A) HIV, (B) syphilis, (C) gonorrhea, and (D) chlamydia.
Figure 3. Receiver operating characteristic curve performance of the HIV and sexually transmitted infection (STI) risk prediction tool on (A) testing data analysis from 2015-2018, (B) external data validation analysis from 2019, and (C) external data validation analysis from 2020-2021. AUC: area under the curve.
Figure 4. Graphical user interface elements of the HIV and sexually transmitted infection (STI) risk prediction tool, called MySTIRisk. A prototype version of the tool is available at [48]. Machine learning algorithms are used to predict a person’s risk of chlamydia, gonorrhea, syphilis, and HIV.

References

    1. Ramchandani MS, Golden MR. Confronting rising STIs in the era of PrEP and treatment as prevention. Curr HIV/AIDS Rep. 2019 Jun;16(3):244–256. doi: 10.1007/s11904-019-00446-5. https://europepmc.org/abstract/MED/31183609
    2. Chow EPF, Grulich AE, Fairley CK. Epidemiology and prevention of sexually transmitted infections in men who have sex with men at risk of HIV. Lancet HIV. 2019 Jun;6(6):e396–e405. doi: 10.1016/S2352-3018(19)30043-8.
    3. Report on global sexually transmitted infection surveillance 2018. World Health Organization. 2018. [2019-05-04]. https://apps.who.int/iris/bitstream/handle/10665/277258/9789241565691-en...
    4. HIV, viral hepatitis and sexually transmissible infections in Australia: Annual surveillance report 2021. Kirby Institute. 2021. [2022-04-06]. https://kirby.unsw.edu.au/sites/default/files/kirby/report/Annual-Suveil...
    5. HIV, viral hepatitis and sexually transmissible infections in Australia: Annual surveillance report 2018. Kirby Institute. 2018. [2019-05-08]. https://kirby.unsw.edu.au/report/hiv-viral-hepatitis-and-sexually-transm...
