An Automated Bioinformatics Pipeline Informing Near-Real-Time Public Health Responses to New HIV Diagnoses in a Statewide HIV Epidemic
- PMID: 36992446
- PMCID: PMC10058263
- DOI: 10.3390/v15030737
Abstract
Molecular HIV cluster data can guide public health responses towards ending the HIV epidemic. Currently, real-time data integration, analysis, and interpretation are challenging, leading to a delayed public health response. We present a comprehensive methodology for addressing these challenges through data integration, analysis, and reporting. We integrated heterogeneous data sources across systems and developed an open-source, automatic bioinformatics pipeline that provides molecular HIV cluster data to inform public health responses to new statewide HIV-1 diagnoses, overcoming data management, computational, and analytical challenges. We demonstrate implementation of this pipeline in a statewide HIV epidemic and use it to compare the impact of specific phylogenetic and distance-only methods and datasets on molecular HIV cluster analyses. The pipeline was applied to 18 monthly datasets generated between January 2020 and June 2022 in Rhode Island, USA, that provide statewide molecular HIV data to support routine public health case management by a multi-disciplinary team. The resulting cluster analyses and near-real-time reporting guided public health actions in 37 phylogenetically clustered cases out of 57 new HIV-1 diagnoses. Of the 37, only 21 (57%) clustered by distance-only methods. Through a unique academic-public health partnership, an automated open-source pipeline was developed and applied to prospective, routine analysis of statewide molecular HIV data in near-real-time. This collaboration informed public health actions to optimize disruption of HIV transmission.
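To illustrate what a distance-only clustering approach involves, the following is a minimal sketch, not the pipeline's actual implementation. It assumes aligned sequences of equal length, uses a simple proportion-of-differing-sites distance as a stand-in for whatever genetic distance the pipeline computes, and links any two sequences whose distance falls at or below a caller-supplied threshold. The function names, the distance measure, and the threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def pairwise_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or ambiguous base ('N')
    are skipped. This simple measure is an assumption for illustration.
    """
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    # With no comparable sites, treat the pair as maximally distant.
    return diffs / compared if compared else 1.0


def distance_clusters(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Group sequences into clusters by single linkage on pairwise distance.

    Two sequences are linked when their distance is <= threshold; clusters
    are the connected components with at least two members.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if pairwise_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    # Toy sequences, invented purely for demonstration.
    toy = {
        "A": "ACGTACGTAC",
        "B": "ACGTACGTAT",
        "C": "TTTTTTTTTT",
        "D": "ACGTACGAAT",
    }
    print(distance_clusters(toy, threshold=0.15))
```

Because linkage is transitive, A and D end up in the same cluster through B even though their direct distance exceeds the threshold. A phylogenetic method groups sequences by shared ancestry in a tree rather than by a pairwise cutoff, which is one reason the two approaches can disagree on which cases cluster.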
Keywords: HIV transmission networks; contact tracing; molecular HIV clusters; molecular epidemiology; near-real-time data integration; phylogenetics.
Conflict of interest statement
M.H. is currently Sr. Data Scientist at Amazon.com, Inc., but conducted this research prior to starting that role.