PLoS One. 2017 Dec 28;12(12):e0189999.
doi: 10.1371/journal.pone.0189999. eCollection 2017.

HIV-1 envelope sequence-based diversity measures for identifying recent infections

Alexis Kafando et al. PLoS One. 2017.

Abstract

Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.
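
The abstract names the four diversity measures but does not define them. The sketch below uses common textbook definitions, which are assumptions and may differ from the authors' exact pipeline: number of haplotypes as the count of distinct reads, percent complexity as distinct reads over total reads, percent diversity as the mean pairwise p-distance, and Shannon entropy over haplotype frequencies (natural log). The function names and the toy reads are hypothetical, and reads are assumed to be aligned and of equal length.

```python
from collections import Counter
from itertools import combinations
from math import log


def number_of_haplotypes(reads):
    """Count distinct sequences among the aligned reads of one segment."""
    return len(set(reads))


def percent_complexity(reads):
    """Distinct sequences as a percentage of all reads (assumed definition)."""
    return 100.0 * len(set(reads)) / len(reads)


def shannon_entropy(reads):
    """Shannon entropy of haplotype frequencies, using the natural log."""
    total = len(reads)
    return -sum((c / total) * log(c / total) for c in Counter(reads).values())


def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def percent_diversity(reads):
    """Mean pairwise p-distance over all read pairs, as a percentage."""
    pairs = list(combinations(reads, 2))
    if not pairs:
        return 0.0
    return 100.0 * sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy data (invented for illustration): four aligned reads, one variant.
reads = ["ACGT", "ACGT", "ACGT", "ACGA"]
print(number_of_haplotypes(reads))       # 2
print(percent_complexity(reads))         # 50.0
print(percent_diversity(reads))          # 12.5
print(round(shannon_entropy(reads), 4))  # 0.5623
```

In this toy case a nearly clonal population yields low values on all four measures, which is the pattern one would expect shortly after infection if diversity accumulates over time.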

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Schematic figure showing all env segments used for diversity estimates.
Segment lengths correspond to nucleotide positions in HIV-1 strain HXB2. Segments used are denoted by asterisks. Env domain abbreviations: SP, signal peptide; C1–C5, conserved domains 1 to 5; V1–V5, variable domains 1 to 5; FP, fusion peptide; HR1, heptad repeat 1 (NHR); DL, disulfide loop; HR2, heptad repeat 2 (CHR); MPER, membrane proximal ectodomain region; TM, transmembrane domain; CD, cytoplasmic domain. Image adapted from Michael Caffrey [61]; Trends in Microbiology, Volume 19, Issue 4, Pages 191–197 (April 2011), doi: 10.1016/j.tim.2011.02.001.
Fig 2. Number of sequences (one per patient) used in this study.
Sequence data from N = 249 individuals (134 recently and 115 chronically HIV-1-infected) were included in the study.
Fig 3. Frequency polygons (ggplot2) of percent complexity of env sequences from recently HIV-1-infected individuals compared with chronically infected individuals, by env segment.
The Y axis represents the density of observations (frequency) and the X axis the distribution of percent complexity as a sequence-based diversity measure. Blue shows the distribution for the recently HIV-1-infected population and red the distribution for the chronically infected population.
Fig 4. Frequency polygons (ggplot2) of percent diversity of env sequences from recently HIV-1-infected individuals compared with chronically infected individuals, by env segment.
The Y axis represents the density of observations (frequency) and the X axis the distribution of percent diversity as a sequence-based diversity measure. Blue shows the distribution for the recently HIV-1-infected population and red the distribution for the chronically infected population.
Fig 5. Frequency polygons (ggplot2) of number of haplotypes of env sequences from recently HIV-1-infected individuals compared with chronically infected individuals, by env segment.
The Y axis represents the density of observations (frequency) and the X axis the distribution of the number of haplotypes as a sequence-based diversity measure. Blue shows the distribution for the recently HIV-1-infected population and red the distribution for the chronically infected population.
Fig 6. Frequency polygons (ggplot2) of the Shannon entropy index of env sequences from recently HIV-1-infected individuals compared with chronically infected individuals, by env segment.
The Y axis represents the density of observations (frequency) and the X axis the distribution of the Shannon entropy index as a sequence-based diversity measure. Blue shows the distribution for the recently HIV-1-infected population and red the distribution for the chronically infected population.
Fig 7. ROC curves comparing the performance of the 4 sequence-based diversity measures for discriminating recent from chronic HIV-1 infection.
Four selected HIV-1 gp120 conserved subdomains (C2, C3, C4 and C5), subdivided into seven segments, were analyzed: three segments in the gp120-C2 region (C2_1, C2_2 and C2_3), two segments in the gp120-C3 region (C3_1 and C3_2), one segment in gp120-C4 and one segment in gp120-C5. The Y axis represents the proportion of recently HIV-1-infected individuals who were correctly classified as recent (sensitivity), and the X axis the proportion of chronically HIV-1-infected individuals who were incorrectly classified as recent (1 - specificity). ROC = receiver operating characteristic; AUC = area under the curve. AUC values between 0.8 and 1 were considered to indicate satisfactory performance.
Fig 8. ROC curves comparing the performance of the 4 sequence-based diversity measures for discriminating recent from chronic HIV-1 infection.
Six segments were analyzed: one for each of the five HIV-1 gp120 variable loops (gp120-V1, gp120-V2, gp120-V3, gp120-V4 and gp120-V5) and one covering part of the gp41 ectodomain (gp41-NHR). The Y axis represents the proportion of recently HIV-1-infected individuals who were correctly classified as recent (sensitivity), and the X axis the proportion of chronically HIV-1-infected individuals who were incorrectly classified as recent (1 - specificity). ROC = receiver operating characteristic; NHR = N-terminal heptad repeat; AUC = area under the curve. AUC values between 0.8 and 1 were considered to indicate satisfactory performance.
Fig 9. ROC curves comparing the predictive performance of different combinations of sequence-based diversity measures of HIV-1 gp120 conserved subdomains to identify HIV-1 infection recency.
Five combinations of sequence-based diversity measures were analyzed: P1, Shannon entropy + percent diversity + percent complexity; P2, percent diversity + number of haplotypes + percent complexity; P3, number of haplotypes + percent complexity; P4, Shannon entropy + percent complexity; and P5, percent diversity + percent complexity. Seven HIV-1 env segments were considered: gp120-C2_1, gp120-C2_2, gp120-C2_3, gp120-C3_1, gp120-C3_2, gp120-C4 and gp120-C5. ROC = receiver operating characteristic; AUC = area under the curve. AUC values between 0.8 and 1 were considered to indicate satisfactory performance.
Fig 10. ROC curves comparing the predictive performance of different combinations of sequence-based diversity measures of five HIV-1 env gp120 variable loops and one part of the gp41 ectodomain (NHR) to identify HIV-1 infection recency.
Five combinations of sequence-based diversity measures were analyzed: P1, Shannon entropy + percent diversity + percent complexity; P2, percent diversity + number of haplotypes + percent complexity; P3, number of haplotypes + percent complexity; P4, Shannon entropy + percent complexity; and P5, percent diversity + percent complexity. Six HIV-1 env segments were considered: gp120-V1 loop, gp120-V2 loop, gp120-V3 loop, gp120-V4 loop, gp120-V5 loop and gp41-NHR (partial ectodomain). NHR = N-terminal heptad repeat; ROC = receiver operating characteristic; AUC = area under the curve. AUC values between 0.8 and 1 were considered to indicate satisfactory performance.
Fig 11. ROC curves comparing the predictive performance of different sequence-based diversity measures and their combinations for the HIV-1 gp120-C2_1, gp120-C2_3 and gp120-V3 segments for identifying HIV-1 subtype B infection recency.
Six sequence-based diversity measures or combinations were analyzed: P1, percent complexity; P2, percent diversity; P3, number of haplotypes; P4, Shannon entropy; P5, Shannon entropy + percent diversity; and P6, number of haplotypes + percent diversity. Three HIV-1 env segments were considered: gp120-C2_1, gp120-C2_3 and gp120-V3. ROC = receiver operating characteristic; AUC = area under the curve. AUC values between 0.8 and 1 were considered to indicate satisfactory performance.
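
The ROC quantities used in the captions can be illustrated with a minimal sketch. It assumes that a lower diversity score (for example, lower Shannon entropy) points to a recent infection, which is an assumption here rather than something the captions state, and it computes the AUC by direct pairwise comparison (equivalent to the Mann-Whitney statistic) rather than with whatever software the authors used. The function names and the scores are hypothetical, invented for illustration only.

```python
def roc_point(recent_scores, chronic_scores, threshold):
    """Return (sensitivity, 1 - specificity) when a sample is called
    'recent' if its diversity score is at or below the threshold."""
    sensitivity = sum(s <= threshold for s in recent_scores) / len(recent_scores)
    false_positive_rate = sum(s <= threshold for s in chronic_scores) / len(chronic_scores)
    return sensitivity, false_positive_rate


def roc_auc(recent_scores, chronic_scores):
    """Probability that a randomly chosen recent sample has a lower score
    than a randomly chosen chronic sample; ties count as one half."""
    wins = 0.0
    for r in recent_scores:
        for c in chronic_scores:
            if r < c:
                wins += 1.0
            elif r == c:
                wins += 0.5
    return wins / (len(recent_scores) * len(chronic_scores))


# Invented entropy values for three recent and three chronic samples.
recent = [0.1, 0.2, 0.4]
chronic = [0.3, 0.5, 0.6]

print(roc_point(recent, chronic, threshold=0.3))  # sensitivity 2/3, 1 - specificity 1/3
print(round(roc_auc(recent, chronic), 3))         # 0.889
```

Under the criterion quoted in the captions, the toy AUC of about 0.889 would fall in the 0.8 to 1 range.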

References

    1. Public Health Agency of Canada. HIV and AIDS in Canada, surveillance report to December 31, 2014. http://healthycanadians.gc.ca/publications/diseases-conditions-maladies-....
    2. Hollingsworth TD, Anderson RM, Fraser C. HIV-1 transmission, by stage of infection. Journal of Infectious Diseases. 2008;198(5):687–93. doi: 10.1086/590501
    3. van Sighem A, Nakagawa F, De Angelis D, Quinten C, Bezemer D, de Coul EO, et al. Estimating HIV Incidence, Time to Diagnosis, and the Undiagnosed HIV Epidemic Using Routine Surveillance Data. Epidemiology. 2015;26(5):653–60. doi: 10.1097/EDE.0000000000000324
    4. Mastro TD. Determining HIV Incidence in Populations: Moving in the Right Direction. Journal of Infectious Diseases. 2013;207(2):204–6. doi: 10.1093/infdis/jis661
    5. Smith MK, Rutstein SE, Powers KA, Fidler S, Miller WC, Eron JJ Jr., et al. The Detection and Management of Early HIV Infection: A Clinical and Public Health Emergency. Jaids-Journal of Acquired Immune Deficiency Syndromes. 2013;63:S187–S99.