Using Machine Learning to Identify Patients at Risk of Acquiring HIV in an Urban Health System
- PMID: 39116330
- PMCID: PMC11315401
- DOI: 10.1097/QAI.0000000000003464
Abstract
Background: Effective measures exist to prevent the spread of HIV. However, the identification of patients who are candidates for these measures can be a challenge. A machine learning model to predict risk for HIV may enhance patient selection for proactive outreach.
Setting: Using data from the electronic health record at Parkland Health, 1 of the largest public healthcare systems in the United States, a machine learning model is created to predict incident HIV cases. The study cohort includes any patient aged 16 years or older from 2015 to 2019 (n = 458,893).
Methods: The data, with an HIV incidence rate <0.08%, are randomly split 70:30 into training and validation sets, stratified by incidence of HIV. The model, a Light Gradient Boosting Machine (an ensemble classifier), is evaluated using the k-fold cross-validated (k = 5) area under the receiver operating characteristic curve (AUC).
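The split described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, the cohort is synthetic (100,000 hypothetical patients with 70 positives, about 0.07%), and the function names `stratified_split` and `stratified_kfold` are invented for illustration. It shows how stratifying by outcome keeps the rare positive class proportionally represented in the 70:30 split and in each of the 5 cross-validation folds.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/validation, shuffling each class separately
    so both sets keep (approximately) the same class ratio."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)

def stratified_kfold(labels, indices, k=5, seed=0):
    """Deal the given indices into k folds, one class at a time,
    so each fold receives a near-equal share of every class."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels[i] for i in indices)):
        idx = [i for i in indices if labels[i] == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

# Synthetic, hypothetical cohort: 70 incident cases among 100,000 patients.
labels = [1] * 70 + [0] * 99_930

train_idx, valid_idx = stratified_split(labels, train_frac=0.7)
folds = stratified_kfold(labels, train_idx, k=5)

train_pos = sum(labels[i] for i in train_idx)
valid_pos = sum(labels[i] for i in valid_idx)
fold_pos = [sum(labels[i] for i in fold) for fold in folds]

print(len(train_idx), len(valid_idx))  # sizes of the two sets
print(train_pos, valid_pos)            # positives in each set
print(fold_pos)                        # positives per cross-validation fold
```

With an outcome this rare, an unstratified split could leave a fold with very few positives, which would make a per-fold AUC unstable; dealing each class into folds separately avoids that.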
Results: The light gradient boosting machine, a gradient boosting classifier, produces the strongest predictive power to identify good candidates for HIV preexposure prophylaxis (PrEP), with an AUC of 0.88 (95% confidence interval: 0.86 to 0.89) on the training set and 0.85 (95% confidence interval: 0.81 to 0.89) on the validation set, at a sensitivity of 77.8% and a specificity of 75.1%.
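For readers less familiar with the metrics reported in the Results, the following sketch shows how AUC, sensitivity, and specificity are computed from predicted risk scores. The labels, scores, and the 0.5 threshold are toy values chosen for illustration, not study data, and the helper names `roc_auc` and `sensitivity_specificity` are hypothetical. The AUC is computed directly from its definition: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, scores, threshold):
    """Classify score >= threshold as positive and return
    (sensitivity, specificity) from the resulting confusion matrix."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example (not study data): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

auc = roc_auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)

print(auc)   # 0.875
print(sens)  # 0.75
print(spec)  # 5/6, about 0.833
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds; the pairwise loop here is quadratic and is meant for clarity, not for a cohort of several hundred thousand patients.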
Conclusions: A gradient boosting model using electronic health record data can be used to identify patients at risk of acquiring HIV and implemented in the clinical setting to build outreach for preventative interventions.
Copyright © 2024 The Author(s). Published by Wolters Kluwer Health, Inc.
Conflict of interest statement
A.N. and H.L.K. receive research funding from Gilead Sciences. The remaining authors have no conflicts of interest to declare. In addition, the contributors named in the acknowledgment received no compensation for their work other than their usual salary and have no conflicts of interest relevant to this article.
