Retrovirology. 2016 May 3;13(1):32.
doi: 10.1186/s12977-016-0266-9.

Utilization of HIV-1 envelope V3 to identify X4- and R5-specific Tat and LTR sequence signatures

Gregory C Antell et al. Retrovirology. 2016.

Abstract

Background: HIV-1 entry is a receptor-mediated process directed by the interaction of the viral envelope with the host cell CD4 molecule and one of two co-receptors, CCR5 or CXCR4. The amino acid sequence of the third variable (V3) loop of the HIV-1 envelope is highly predictive of co-receptor utilization preference during entry, and machine learning predictive algorithms have been developed to characterize sequences as CCR5-utilizing (R5) or CXCR4-utilizing (X4). It was hypothesized that while the V3 loop is predominantly responsible for determining co-receptor binding, additional components of the HIV-1 genome may contribute to overall viral tropism and display sequence signatures associated with co-receptor utilization.

Results: The accessory protein Tat and the HIV-1 long terminal repeat (LTR) were analyzed with respect to genetic diversity, and R5- and X4-classified sequences were compared by Jensen–Shannon divergence. Divergence correlated with both the mean genetic diversity and the absolute difference in genetic diversity between the R5 and X4 groups. As expected, the V3 domain of the gp120 protein was enriched with statistically divergent positions. Statistically divergent positions were also identified in Tat amino acid sequences within the transactivation and TAR-binding domains, and in nucleotide positions throughout the LTR. We further analyzed LTR sequences for putative transcription factor binding sites using the JASPAR transcription factor binding profile database and found several putative differences in transcription factor binding sites between R5 and X4 HIV-1 genomes. Specifically, C/EBP sites I and II and Sp site III differed in sequence configuration between R5 and X4 LTRs.

Conclusion: These observations support the hypothesis that co-receptor utilization coincides with specific genetic signatures in HIV-1 Tat and the LTR, likely due to differing transcriptional regulatory mechanisms and selective pressures applied within specific cellular targets during the course of productive HIV-1 infection.

Keywords: Co-receptor; Divergence; Diversity; HIV-1; LTR; Tat; Transcription factor; Tropism; V3; gp120.

Figures

Fig. 1
HIV-1 genetic diversity is highly correlated between corresponding positions in R5- and X4-classified gp120, Tat, and LTR sequence populations. The genetic diversity (order = 1) of each position of gp120, Tat, and LTR was calculated according to Eq. 1. The positions were sorted across the x-axis according to the R5 diversity values (red line), with the corresponding X4 positions plotted (blue dots). With this visualization, the vertical distance between the line and the corresponding dot represents the difference in diversity between the R5- and X4-classified sequences at each position. In general, the X4 values were found to cluster around the R5 values, with a slight skew towards less diversity within the X4 population. Spearman’s rank correlation was performed to assess the correlation between R5 and X4 diversity for gp120 (ρ = 0.8678, P = 2.00 × 10⁻¹⁵⁶), Tat (ρ = 0.8873, P = 4.67 × 10⁻³⁵), and LTR (ρ = 0.7021, P = 4.06 × 10⁻⁷⁸). In all cases, R5 and X4 diversity were well-correlated
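The per-position diversity described in this caption can be sketched as follows. Eq. 1 is not reproduced on this page, so the sketch assumes the standard definition of a Hill number of order 1, namely the exponential of the Shannon entropy of the variant frequencies at one alignment column. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def hill_diversity_order1(column):
    """Effective number of variants in one alignment column (Hill number, q = 1).

    Assumption: order-1 diversity is exp(Shannon entropy in nats), the
    standard Hill-number definition; the paper's Eq. 1 is not shown here.
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


# Hypothetical toy columns: one residue per sequence at a single position.
conserved = "AAAAAAAA"   # a single variant, so diversity is 1
mixed = "AAAACCGG"       # frequencies 0.5, 0.25, 0.25

for name, column in [("conserved", conserved), ("mixed", mixed)]:
    print(name, round(hill_diversity_order1(column), 3))
```

A fully conserved column gives a diversity of 1, and a column with k equally frequent variants gives k, which is why the value reads as an effective number of variants.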
Fig. 2
Jensen–Shannon divergence is correlated with both mean genetic diversity and the absolute difference in genetic diversity. The relationship between Jensen–Shannon divergence and genetic diversity (order = 1) in HIV-1 gp120, Tat, and LTR sequences was evaluated using Spearman’s rank correlation. Both the mean diversity of R5- and X4-classified sequences and the absolute difference between R5 and X4 diversity correlated with Jensen–Shannon divergence. This result indicates that large divergence can reflect not only increased amounts of information (as indicated by high mean diversity), but also the loss of information in one of the two groups (as indicated by the absolute difference in diversity)
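Spearman's rank correlation, as used in this caption, is the Pearson correlation of the ranks of the two variables. The following is a minimal standard-library sketch, assuming ties receive their average rank; the helper names and the per-position numbers are hypothetical and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-position values: mean diversity and divergence.
mean_diversity = [1.0, 1.2, 2.5, 3.1, 6.4]
divergence = [0.001, 0.004, 0.003, 0.020, 0.050]
print(round(spearman_rho(mean_diversity, divergence), 3))
```

Because only ranks enter the calculation, the statistic is insensitive to the very different scales of diversity and divergence values.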
Fig. 3
HIV-1 gp120 demonstrates high Jensen–Shannon divergence in regions with high genetic diversity. HIV-1 gp120 sequences were classified as CCR5 (R5) (n = 1681) or CXCR4 (X4) (n = 52) according to the predicted co-receptor usage of the V3 domain Web-PSSM score [17]. a The diversity index at a Hill number of 1 was calculated for each position for both R5 (red) and X4 (blue) gp120 amino acid sequence populations. Diversity values range from 1 to greater than 10, with the variable domains of gp120 displaying the greatest diversity. b The Jensen–Shannon divergence between R5 and X4 gp120 sequence populations was computed for each amino acid position and plotted with a diamond. Statistically divergent positions (P < 0.01) were plotted in red. A Monte Carlo permutation test was performed to iteratively group gp120 sequences into random groups and calculate a distribution of expected Jensen–Shannon divergence values. The full range of this distribution was plotted in light blue with the interquartile range plotted in dark blue. The full range of divergence for randomly generated groups is in close agreement with the combined diversity of the R5 and X4 populations
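The divergence and permutation procedure in this caption can be illustrated with a short sketch. It assumes an equally weighted Jensen–Shannon divergence in bits and an add-one empirical P value, which are common conventions and not necessarily the authors' exact choices. The function names, the number of permutations, and the toy columns are hypothetical.

```python
import math
import random
from collections import Counter


def frequencies(column, alphabet):
    """Relative frequency of each symbol of `alphabet` in `column`."""
    counts = Counter(column)
    return [counts[s] / len(column) for s in alphabet]


def entropy_bits(p):
    """Shannon entropy in bits, skipping zero-probability symbols."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(col_a, col_b):
    """Equally weighted Jensen-Shannon divergence between two columns."""
    alphabet = sorted(set(col_a) | set(col_b))
    p = frequencies(col_a, alphabet)
    q = frequencies(col_b, alphabet)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy_bits(m) - (entropy_bits(p) + entropy_bits(q)) / 2


def permutation_p_value(col_a, col_b, n_perm=500, seed=0):
    """Monte Carlo permutation test: shuffle group labels, recompute divergence.

    Returns the fraction of random groupings whose divergence is at least
    the observed value, with an add-one correction (an assumed convention).
    """
    rng = random.Random(seed)
    observed = js_divergence(col_a, col_b)
    pooled = list(col_a) + list(col_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(col_a)], pooled[len(col_a):]
        if js_divergence(perm_a, perm_b) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical column: the two groups use entirely different residues.
r5_column = "A" * 20
x4_column = "C" * 20
print(js_divergence(r5_column, x4_column))
print(permutation_p_value(r5_column, x4_column))
```

Shuffling keeps the group sizes fixed, so the null distribution reflects the same imbalance between groups as the real data (for example, 1681 R5 versus 52 X4 gp120 sequences).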
Fig. 4
V3 domain of gp120 is enriched with statistically divergent positions. The 10 conserved and variable domains of gp120 were evaluated to determine if any regions were enriched in statistically divergent sites. A hypergeometric test was used to determine enrichment and depletion of statistically divergent positions, using the null hypothesis of equal distribution amongst domains. The V3 loop was identified as being highly enriched (P = 1.74 × 10⁻¹¹), while the C1 domain (P = 3.03 × 10⁻⁴) and C2 domain (P = 1.28 × 10⁻⁴) were statistically depleted at P < 0.01 using a Benjamini–Hochberg multiple testing correction
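The enrichment test and multiple-testing correction named in this caption can be sketched with the standard library alone. The sketch assumes an upper-tail hypergeometric probability for enrichment (depletion would use the lower tail instead) and the usual Benjamini–Hochberg step-up adjustment. All counts and P values below are hypothetical placeholders, not figures from the paper.

```python
from math import comb


def hypergeom_upper_tail(k, total, divergent, domain_size):
    """P(X >= k) divergent positions in a domain under random placement.

    total: positions in the protein; divergent: divergent positions overall;
    domain_size: positions in the domain; k: divergent positions observed there.
    """
    denom = comb(total, domain_size)
    upper = min(divergent, domain_size)
    numer = sum(
        comb(divergent, i) * comb(total - divergent, domain_size - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 500 positions, 40 divergent, a 35-position domain
# that contains 15 of the divergent positions.
p_enriched = hypergeom_upper_tail(15, 500, 40, 35)
print(p_enriched)

# Hypothetical raw P values for three domains.
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Using exact integer binomial coefficients avoids the underflow that can occur when very small tail probabilities are computed from floating-point factorials.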
Fig. 5
Jensen–Shannon divergence identifies positions of differential amino acid usage between R5 and X4 HIV-1 Tat sequences. HIV-1 Tat sequences were sorted into R5 (n = 504) and X4 (n = 31) populations according to the predicted co-receptor usage of the co-linear V3 domain as determined by Web-PSSM score. a The diversity index at order = 1 was calculated for each position for both R5 (red) and X4 (blue) Tat sequence populations. The diversity index between R5 and X4 populations displayed high similarity at nearly all positions, with the second half of Tat displaying higher diversity values overall for both populations. b The Jensen–Shannon divergence between R5 and X4 Tat sequences was computed for each amino acid position and plotted with a diamond. Statistically divergent positions 7, 23, 57, and 60 (P < 0.01) were plotted in red and consensus changes, positions 40 and 67, were plotted in yellow. A Monte Carlo permutation test was performed to iteratively group Tat sequences into random groups and calculate a distribution of expected Jensen–Shannon divergence values. The full range of this distribution was plotted in light blue with the interquartile range plotted in dark blue
Fig. 6
Statistically divergent positions between X4 and R5 HIV-1 Tat interchange amino acids for those with similar physicochemical properties. The amino acid usage in four HIV-1 Tat amino acid positions (7, 23, 57, and 60) was plotted for both R5 and X4 groups as a stacked bar chart representing the total genetic variation within each population at the respective positions. Amino acids were color coded according to physicochemical property using the following scheme: positively charged (red), negatively charged (blue), polar uncharged (purple), hydrophobic (green), and unclassified (glycine, proline, and cysteine; yellow). The amino acid positions 7, 23, 57, and 60 were selected due to their statistically significant Jensen–Shannon divergence
Fig. 7
Statistically divergent Tat positions demonstrate reduced diversity within X4-classified sequences. Within HIV-1 Tat, four amino acid positions were identified as having statistically significant Jensen–Shannon divergence: 7, 23, 57, and 60. In all four cases, it was noted that X4-classified variants exhibited a lower overall genetic diversity at an order of 1, largely due to the enhanced presence of the most common variant in the X4 population. This pattern of diminished diversity within X4 in comparison to R5 suggests that a purifying selective force may be present, affecting a subset of HIV-1 Tat variants
Fig. 8
HIV-1 LTR demonstrates high divergence both upstream and downstream of the transcription start site. HIV-1 long terminal repeat (LTR) sequences were sorted into R5 (n = 615) and X4 (n = 35) populations according to the predicted co-receptor usage of the co-linear V3 region. a The diversity index at order = 1 was calculated for each position for both R5 (red) and X4 (blue) LTR sequence populations, numbered according to the HXB2 reference sequence. b Following the same approach applied for amino acid analysis, Jensen–Shannon divergence between R5 and X4 LTR sequences was computed for each nucleotide position and plotted. Statistically divergent positions were plotted in red and identified throughout the LTR, both upstream and downstream of the transcriptional start site and within transcription factor binding sites. A Monte Carlo permutation simulation was performed to randomly group LTR sequences and calculate a distribution of expected Jensen–Shannon divergence values, with the full range (light blue) and interquartile range (dark blue) of the distribution plotted across each position of the LTR
Fig. 9
R5 and X4 LTR sequences demonstrate signature enriched nucleotide variants in transcription factor binding sites. HIV-1 transcription factor binding sites that have been confirmed in vitro, C/EBP-II (HXB2 positions 281–289), ATF-CREB (330–337), C/EBP-I (338–349), NF-κB-II (350–359), NF-κB-I (363–373), Sp-III (377–386), Sp-II (388–398), and Sp-I (399–408), as well as the TAR stem loop (454–518), were evaluated to detect enrichment and depletion of nucleotide variants in R5 and X4 sets of aligned LTR sequences using two sample logos. Enriched nucleotides were plotted proportional to the difference between the populations, with the sum of the most differential position plotted on the vertical axis

References

    1. Arrildt KT, Joseph SB, Swanstrom R. The HIV-1 env protein: a coat of many colors. Curr HIV/AIDS Rep. 2012;9:52–63. doi: 10.1007/s11904-011-0107-3.
    2. Sirois S, Sing T, Chou KC. HIV-1 gp120 V3 loop for structure-based drug design. Curr Protein Pept Sci. 2005;6:413–422. doi: 10.2174/138920305774329359.
    3. Javaherian K, Langlois AJ, McDanal C, Ross KL, Eckler LI, Jellis CL, Profy AT, Rusche JR, Bolognesi DP, Putney SD, et al. Principal neutralizing domain of the human immunodeficiency virus type 1 envelope protein. Proc Natl Acad Sci USA. 1989;86:6768–6772. doi: 10.1073/pnas.86.17.6768.
    4. Goudsmit J, Debouck C, Meloen RH, Smit L, Bakker M, Asher DM, Wolff AV, Gibbs CJ Jr, Gajdusek DC. Human immunodeficiency virus type 1 neutralization epitope with conserved architecture elicits early type-specific antibodies in experimentally infected chimpanzees. Proc Natl Acad Sci USA. 1988;85:4478–4482. doi: 10.1073/pnas.85.12.4478.
    5. Sharon M, Kessler N, Levy R, Zolla-Pazner S, Gorlach M, Anglister J. Alternative conformations of HIV-1 V3 loops mimic beta hairpins in chemokines, suggesting a mechanism for coreceptor selectivity. Structure. 2003;11:225–236. doi: 10.1016/S0969-2126(03)00011-X.
