Development and validation of an automated HIV prediction algorithm to identify candidates for pre-exposure prophylaxis: a modelling study
- PMID: 31285182
- PMCID: PMC7522919
- DOI: 10.1016/S2352-3018(19)30139-0
Abstract
Background: HIV pre-exposure prophylaxis (PrEP) is effective but underused, in part because clinicians do not have the tools to identify PrEP candidates. We developed and validated an automated prediction algorithm that uses electronic health record (EHR) data to identify individuals at increased risk for HIV acquisition.
Methods: We used machine learning algorithms to predict incident HIV infections with 180 potential predictors of HIV risk drawn from EHR data from 2007-15 at Atrius Health, an ambulatory group practice in Massachusetts, USA. We included EHRs of all patients aged 15 years or older with at least one clinical encounter during 2007-15. We used ten-fold cross-validated area under the receiver operating characteristic curve (cv-AUC) with 95% CIs to assess the model's performance at identifying individuals with incident HIV and patients independently prescribed PrEP by clinicians. The best-performing model was validated prospectively with 2016 data from Atrius Health and externally with 2011-16 data from Fenway Health, a community health centre specialising in sexual health care in Boston (MA, USA). We calculated HIV risk scores (ie, probability of an incident HIV diagnosis) for every HIV-uninfected patient not on PrEP during 2007-15 at Atrius Health and assessed the distribution of scores for thresholds to determine possible candidates for PrEP in the three study cohorts.
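The ten-fold cross-validated AUC described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, `fit_mean_difference` is a hypothetical stand-in for the LASSO model, the stratified fold assignment is an assumption, and the 95% CIs are not computed. The AUC is calculated as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Assign every index to one of k folds, spreading cases and non-cases evenly (an assumption)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cv_auc(X, y, fit, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the per-fold AUCs."""
    aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([score(X[i]) for i in held_out], [y[i] for i in held_out]))
    return mean(aucs), aucs


def fit_mean_difference(X_train, y_train):
    """Hypothetical stand-in for LASSO: weight each feature by (case mean - non-case mean)."""
    n_feat = len(X_train[0])
    pos = [x for x, y in zip(X_train, y_train) if y == 1]
    neg = [x for x, y in zip(X_train, y_train) if y == 0]
    w = [mean(x[j] for x in pos) - mean(x[j] for x in neg) for j in range(n_feat)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


# Synthetic cohort with a rare outcome: 50 cases among 1000 patients, two made-up predictors.
rng = random.Random(1)
y = [1] * 50 + [0] * 950
X = [[rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]

mean_auc, fold_aucs = cv_auc(X, y, fit_mean_difference)
print(f"cv-AUC over {len(fold_aucs)} folds: {mean_auc:.2f}")
```

Because incident HIV is rare (fewer than 0·1% of the development cohort), each held-out fold must contain at least one case for its AUC to be defined, which is why this sketch stratifies the folds.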
Findings: We included 1 155 966 Atrius Health patients from 2007-15 (150 [<0·1%] patients with incident HIV) in our development cohort, 537 257 Atrius Health patients in 2016 (16 [<0·1%] with incident HIV) in our prospective validation cohort, and 33 404 Fenway Health patients from 2011-16 (423 [1·3%] with incident HIV) in our external validation cohort. The best-performing algorithm was obtained with least absolute shrinkage and selection operator (LASSO) and had a cv-AUC of 0·86 (95% CI 0·82-0·90) for identification of incident HIV infections in the development cohort, 0·91 (0·81-1·00) on prospective validation, and 0·77 (0·74-0·79) on external validation. The LASSO model successfully identified patients independently prescribed PrEP by clinicians at Atrius Health in 2016 (cv-AUC 0·93, 95% CI 0·90-0·96) or Fenway Health (0·79, 0·78-0·80). HIV risk scores increased steeply at the 98th percentile. Using this score as a threshold, we prospectively identified 9515 (1·8%) of 536 384 patients at Atrius Health in 2016 and 4385 (15·3%) of 28 702 Fenway Health patients as potential PrEP candidates.
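The thresholding step in the Findings can be sketched as follows, assuming the cutoff is the 98th-percentile risk score in the development cohort and is then applied unchanged to the other cohorts (one reading of the text, not a confirmed detail). The scores and the helper names `percentile_cutoff` and `flag_candidates` are hypothetical; only the patient counts in the final lines come from the abstract.

```python
import math


def percentile_cutoff(scores, pct):
    """Nearest-rank percentile: smallest score with at least pct% of scores at or below it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_candidates(scores, cutoff):
    """Return the indices of patients whose risk score is at or above the cutoff."""
    return [i for i, s in enumerate(scores) if s >= cutoff]


# Made-up risk scores: a development cohort and a second, higher-risk cohort.
development_scores = [i / 1000 for i in range(1000)]
higher_risk_scores = [0.5 + i / 200 for i in range(100)]

cutoff = percentile_cutoff(development_scores, 98)
dev_flagged = flag_candidates(development_scores, cutoff)
other_flagged = flag_candidates(higher_risk_scores, cutoff)

print(f"cutoff={cutoff}")
print(f"development cohort flagged: {len(dev_flagged) / len(development_scores):.1%}")
print(f"higher-risk cohort flagged: {len(other_flagged) / len(higher_risk_scores):.1%}")

# Proportions reported in the abstract, recomputed from the stated counts.
atrius_2016 = round(100 * 9515 / 536384, 1)  # 1.8
fenway = round(100 * 4385 / 28702, 1)        # 15.3
print(atrius_2016, fenway)
```

A fixed cutoff flags a larger share of a cohort whose score distribution is shifted upward, which is consistent with the reported 1·8% at Atrius Health and 15·3% at Fenway Health.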
Interpretation: Automated algorithms can efficiently identify patients at increased risk for HIV acquisition. Integrating these models into EHRs to alert providers about patients who might benefit from PrEP could improve prescribing and prevent new HIV infections.
Funding: Harvard University Center for AIDS Research, Providence/Boston Center for AIDS Research, Rhode Island IDeA-CTR, the National Institute of Mental Health, and the US Centers for Disease Control and Prevention.
Copyright © 2019 Elsevier Ltd. All rights reserved.
Conflict of interest statement
DECLARATION OF INTERESTS
DK has conducted research supported by Gilead Sciences and has received funding to author continuing medical education content for Medscape, MED-IQ, DKBmed, and UptoDate, Inc. KHM has conducted research supported by Gilead Sciences and ViiV and has received funding to author continuing medical education content for UptoDate, Inc. All authors declare that there are no other potential conflicts of interest.
Comment in
- Electronic health record tools to catalyse PrEP conversations. Lancet HIV. 2019 Oct;6(10):e644-e645. doi: 10.1016/S2352-3018(19)30194-8. Epub 2019 Jul 5. PMID: 31285180. Free PMC article. No abstract available.
Similar articles
- Using electronic health records to identify candidates for human immunodeficiency virus pre-exposure prophylaxis: An application of super learning to risk prediction when the outcome is rare. Stat Med. 2020 Oct 15;39(23):3059-3073. doi: 10.1002/sim.8591. Epub 2020 Jun 24. PMID: 32578905. Free PMC article.
- Use of electronic health record data and machine learning to identify candidates for HIV pre-exposure prophylaxis: a modelling study. Lancet HIV. 2019 Oct;6(10):e688-e695. doi: 10.1016/S2352-3018(19)30137-7. Epub 2019 Jul 5. PMID: 31285183. Free PMC article.
- How to Identify Potential Candidates for HIV Pre-Exposure Prophylaxis: An AI Algorithm Reusing Real-World Hospital Data. Stud Health Technol Inform. 2021 May 27;281:714-718. doi: 10.3233/SHTI210265. PMID: 34042669.
- The Pharmacist's Expanding Role in HIV Pre-Exposure Prophylaxis. AIDS Patient Care STDS. 2019 May;33(5):207-213. doi: 10.1089/apc.2018.0294. PMID: 31067124. Review.
- Preexposure Prophylaxis for the Prevention of HIV Infection: Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2019 Jun 11;321(22):2214-2230. doi: 10.1001/jama.2019.2591. PMID: 31184746.
Cited by
- Application of machine learning algorithms in predicting HIV infection among men who have sex with men: Model development and validation. Front Public Health. 2022 Aug 25;10:967681. doi: 10.3389/fpubh.2022.967681. eCollection 2022. PMID: 36091522. Free PMC article.
- Machine learning outperformed logistic regression classification even with limit sample size: A model to predict pediatric HIV mortality and clinical progression to AIDS. PLoS One. 2022 Oct 14;17(10):e0276116. doi: 10.1371/journal.pone.0276116. eCollection 2022. PMID: 36240212. Free PMC article.
- Development of an electronic health record-based Human Immunodeficiency Virus (HIV) risk prediction model for women, incorporating social determinants of health. BMC Public Health. 2025 Jul 2;25(1):2257. doi: 10.1186/s12889-025-23460-2. PMID: 40604593. Free PMC article.
- Pre-exposure Prophylaxis Persistence Is a Critical Issue in PrEP Implementation. Clin Infect Dis. 2020 Jul 27;71(3):583-585. doi: 10.1093/cid/ciz896. PMID: 31509603. Free PMC article. No abstract available.
- Using electronic health records to identify candidates for human immunodeficiency virus pre-exposure prophylaxis: An application of super learning to risk prediction when the outcome is rare. Stat Med. 2020 Oct 15;39(23):3059-3073. doi: 10.1002/sim.8591. Epub 2020 Jun 24. PMID: 32578905. Free PMC article.