Mol Biol Evol. 2018 Jul 1;35(7):1812-1819.
doi: 10.1093/molbev/msy016.

HIV-TRACE (TRAnsmission Cluster Engine): a Tool for Large Scale Molecular Epidemiology of HIV-1 and Other Rapidly Evolving Pathogens

Sergei L Kosakovsky Pond et al. Mol Biol Evol. 2018.

Abstract

In modern applications of molecular epidemiology, genetic sequence data are routinely used to identify clusters of transmission in rapidly evolving pathogens, most notably HIV-1. Traditional 'shoe-leather' epidemiology infers transmission clusters by tracing chains of partners sharing epidemiological connections (e.g., sexual contact). Here, we present a computational tool for identifying a molecular transmission analog of such clusters: HIV-TRACE (TRAnsmission Cluster Engine). HIV-TRACE implements an approach inspired by traditional epidemiology, identifying chains of partners whose viral genetic relatedness implies direct or indirect epidemiological connections. Molecular transmission clusters are constructed using codon-aware pairwise alignment to a reference sequence, followed by pairwise genetic distance estimation among all sequences. This approach is computationally tractable and is capable of identifying HIV-1 transmission clusters in large surveillance databases comprising tens or hundreds of thousands of sequences in near real time, that is, on the order of minutes to hours. HIV-TRACE is available at www.hivtrace.org and from www.github.com/veg/hivtrace, along with the accompanying result visualization module from www.github.com/veg/hivtrace-viz. Importantly, the approach underlying HIV-TRACE is not limited to the study of HIV-1 and can be applied to study outbreaks and epidemics of other rapidly evolving pathogens.
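The clustering idea described in the abstract can be illustrated with a minimal sketch. This is not the HIV-TRACE implementation: it assumes the sequences have already been aligned to a common reference (so positions are comparable), it swaps in a simple proportion-of-differing-sites distance for whatever distance model the tool actually uses, and the function names, the toy sequences, and the `mutate` helper are hypothetical. The 0.015 substitutions/site threshold is borrowed from the Fig. 2 caption. Individuals are linked when their pairwise distance is at or below the threshold, and clusters are the connected components of the resulting network, so two individuals can share a cluster through an intermediate even when they are not directly linked.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, counting only positions where both
    sequences carry an unambiguous base (a stand-in distance, assumed here)."""
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differing += x != y
    return differing / compared if compared else float("inf")


def cluster(seqs, threshold=0.015):
    """Link pairs with distance <= threshold; return (edges, clusters).

    Clusters are connected components with at least two members,
    found with a small union-find structure.
    """
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    edges = []
    for a, b in combinations(seqs, 2):  # all pairs: quadratic in the number of sequences
        d = p_distance(seqs[a], seqs[b])
        if d <= threshold:
            edges.append((a, b, d))
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return edges, [g for g in groups.values() if len(g) > 1]


def mutate(seq, positions):
    """Hypothetical helper: substitute the base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Toy, pre-aligned sequences of 400 sites (made up for illustration only).
base = "ACGT" * 100
p2 = mutate(base, [0, 10])                 # 2 differences from p1
p3 = mutate(p2, [20, 30, 40, 50, 60])      # 5 from p2, 7 from p1
p4 = mutate(base, range(100, 200))         # 100 differences from p1
seqs = {"p1": base, "p2": p2, "p3": p3, "p4": p4}

edges, clusters = cluster(seqs, threshold=0.015)
print(edges)
print(clusters)
```

In this toy data, p1 and p3 differ at 7 of 400 sites (0.0175), which exceeds the threshold, yet they end up in the same cluster through p2. That is the "direct or indirect" connection the abstract refers to. The all-pairs loop is the expensive step, which is why the distance computation dominates at the scale of surveillance databases.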

Figures

Fig. 1
A schematic of the HIV-TRACE workflow. For each stage, we show example input and output data, indicate computational complexity, and provide empirical run-times as functions of the number of sequences on the example HIV-1 data sets described in the text. Trend lines show linear fits in the log-log space.
Fig. 2
Effect of genetic distance threshold and ambiguity fraction on network construction. (A) Number of clusters and size of largest cluster across increasing genetic distance thresholds. (B) Number of clusters and size of largest cluster across increasing ambiguity fractions. (C) Largest clusters (≥ 7 nodes) from the San Diego Primary Infection Cohort, inferred with a 0.015 substitutions/site genetic distance threshold and a 0.05 ambiguity fraction on a phylogenetic tree (each cluster has its own color and is shown in bold). (D) Members of the large, artifactual cluster when ambiguity fraction is increased to 1.0 and distances from ambiguities in all sequences are resolved (shown in bold, colored in red). San Diego sequence data are from Little et al. (2014), and phylogeny was inferred using FastTree2 (Price et al. 2010b).
Fig. 3
Visualization of the San Diego Primary Infection Cohort cluster (Little et al. 2014) using hivtrace-viz. Circles without connections and darker borders represent clusters, and their area is proportional to cluster size. Nine of the clusters have been expanded, showing individual nodes (individuals) and edges (putative transmission links). Nodes and clusters are colored by risk factor (this is user selectable, and is obtained from network annotation data); for clusters, the distribution of risk factors is shown as a pie chart. The shape of individual nodes indicates the gender of the corresponding individual.

References

    1. Aldous JL, Pond SK, Poon A, Jain S, Qin H, Kahn JS, Kitahata M, Rodriguez B, Dennis AM, Boswell SL, et al. 2012. Characterizing HIV transmission networks across the United States. Clin Infect Dis. 55(8):1135–1143.
    2. Bartlett SR, Wertheim JO, Bull RA, Matthews GV, Lamoury FM, Scheffler K, Hellard M, Maher L, Dore GJ, Lloyd AR, et al. 2017. A molecular transmission network of recent hepatitis C infection in people with and without HIV: implications for targeted treatment strategies. J Viral Hepat. 24(5):404–411.
    3. Campbell EM, Jia H, Shankar A, Hanson D, Luo W, Masciotra S, Owen SM, Oster AM, Galang RR, Spiller MW, et al. 2017. Detailed transmission network analysis of a large opiate-driven outbreak of HIV infection in the United States. J Infect Dis. 216:1053–1062.
    4. Campbell MS, Mullins JI, Hughes JP, Celum C, Wong KG, Raugi DN, Sorensen S, Stoddard JN, Zhao H, Deng W, Partners in Prevention HSV/HIV Transmission Study Team, et al. 2011. Viral linkage in HIV-1 seroconverters and their partners in an HIV-1 prevention clinical trial. PLoS One. 6(3):e16986.
    5. Chaillon A, Avila-Ríos S, Wertheim JO, Dennis A, García-Morales C, Tapia-Trejo D, Mejía-Villatoro C, Pascale JM, Porras-Cortés G, Quant-Durán CJ, Mesoamerican Project Group, et al. 2017. Identification of major routes of HIV transmission throughout Mesoamerica. Infect Genet Evol. 54:98–107.