PLoS Comput Biol. 2010 Apr 15;6(4):e1000743.
doi: 10.1371/journal.pcbi.1000743.

Prediction of co-receptor usage of HIV-1 from genotype

J Nikolaj Dybowski et al. PLoS Comput Biol. 2010.

Abstract

Human Immunodeficiency Virus 1 enters host cells using a receptor (CD4) and one of two co-receptors (CCR5 or CXCR4). Recently, a new class of antiretroviral drugs has entered clinical practice that specifically bind to the co-receptor CCR5 and thus inhibit virus entry. Accurate prediction of the co-receptor used by the virus in the patient is important because it allows for personalized selection of effective drugs and prognosis of disease progression. We have investigated whether it is possible to predict co-receptor usage accurately by analyzing the amino acid sequence of the main determinant of co-receptor usage, i.e., the third variable loop V3 of the gp120 protein. We developed a two-level machine learning approach that, in the first level, considers two different properties important for protein-protein binding, derived from structural models of V3 and from V3 sequences. The second level combines the two predictions of the first level. The two-level method predicts usage of the CXCR4 co-receptor for new V3 sequences within seconds, with an area under the ROC curve of 0.937 ± 0.004. Moreover, it is relatively robust against insertions and deletions, which frequently occur in V3. The approach could help clinicians to find optimal personalized treatments, and it offers new insights into the molecular basis of co-receptor usage. For instance, it quantifies the importance, for co-receptor usage, of a pocket that is probably responsible for binding sulfated tyrosine.
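The area under the ROC curve quoted above can be read as the probability that a randomly chosen CXCR4-using sequence receives a higher score than a randomly chosen CCR5-only sequence. The following is a minimal sketch of that quantity, not the authors' evaluation code; the function name `roc_auc`, the 0/1 label encoding, and the toy scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pair-counting identity.

    labels: iterable of 0/1 (assumed encoding: 1 = CXCR4-using, 0 = CCR5-only)
    scores: predicted score for class 1, one per item
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores: 3 of the 4 positive/negative pairs are
# ordered correctly, so the AUC is 0.75.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of sequences, which is fine for illustration; a rank-based implementation would be preferable for large data sets.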

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Receiver Operating Characteristic (ROC) curves of the two-level random-forest classification approach.
Solid curves: averaged over ten-fold leave-one-patient-out cross-validation with random forests trained on interpolated Kyte-Doolittle hydrophobicity along normalized sequences (green), on the electrostatics hull (red), and on the probability outputs of the two previous random forests, i.e. second-level classification (blue); error bars mark 95% confidence. Dashed curve: averages over ten out-of-bag predictions of second-level random forests on the full training set of sequences, disregarding that several sequences may originate from the same patient.
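To make "interpolated Kyte-Doolittle hydrophobicity along normalized sequences" concrete, the sketch below maps a sequence of arbitrary length onto a fixed number of hydropathy values, so that sequences with insertions or deletions yield feature vectors of equal length. The hydropathy values are the standard Kyte-Doolittle scale; the use of linear interpolation, the default of 35 sample points, and the name `hydropathy_profile` are assumptions for illustration, since the caption does not state the interpolation scheme the authors used.

```python
# Standard Kyte-Doolittle hydropathy index per amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, n_points=35):
    """Resample a sequence's hydropathy values to n_points by linear interpolation.

    n_points=35 is an illustrative choice, not a value taken from the paper.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    if len(values) < 2 or n_points < 2:
        raise ValueError("need at least two residues and two sample points")
    profile = []
    for i in range(n_points):
        # Map sample i onto the original index range [0, len(values) - 1].
        x = i * (len(values) - 1) / (n_points - 1)
        lo = int(x)
        hi = min(lo + 1, len(values) - 1)
        frac = x - lo
        profile.append(values[lo] * (1 - frac) + values[hi] * frac)
    return profile


# Two toy peptides of different length give vectors of the same length.
short = hydropathy_profile("CTRPNNNTRK", n_points=8)
longer = hydropathy_profile("CTRPNNNTRKSIHIG", n_points=8)
print(len(short), len(longer))
```

Fixed-length profiles of this kind can be fed to a classifier directly, which is one plausible reason such a descriptor tolerates insertions and deletions.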
Figure 2. The 5% most important positions on the electrostatics hull for tropism classification by the electrostatics-based random forest.
The backbone of the template V3 conformation is shown as a tube with Cα atoms marked by small beads and some residues numbered for orientation, starting with the N-terminal Cys as residue 1. Points are colored according to the mean electrostatic potential in the respective tropism class on a five-level scale (red, light red, white, light blue, blue); the potential symbol, its unit, and the threshold value for each color were formula images and are not reproduced here.
Figure 3. Importance of positions of normalized V3 sequence in random forest classification with Kyte-Doolittle descriptor.
The higher the peak at the respective position, the more important this position is for correct classification of sequences with respect to co-receptor tropism. The most important region is around normalized sequence position 12, in agreement with the 11/25 rule. The second most important region, around position 8, could be involved in binding of sulfated tyrosine on CCR5. Along the top axis, reference sequence HXB2 before normalization is given for orientation.
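For comparison, the 11/25 rule mentioned above is commonly stated as: predict CXCR4 usage when V3 position 11 or 25 carries a positively charged residue (arginine or lysine), and CCR5 usage otherwise. A minimal sketch follows; it assumes the input is already aligned to the conventional V3 numbering with the N-terminal Cys as position 1, does not handle insertions or deletions, and uses a hypothetical function name.

```python
def rule_11_25(v3_aligned):
    """Classify an aligned V3 sequence with the 11/25 charge rule.

    Assumes position 1 is the N-terminal Cys and that the sequence has no
    insertions or deletions before position 25 (illustration only).
    """
    if len(v3_aligned) < 25:
        raise ValueError("sequence must cover at least positions 1 to 25")
    basic = {"R", "K"}  # positively charged residues considered by the rule
    pos11 = v3_aligned[10].upper()  # 1-based position 11
    pos25 = v3_aligned[24].upper()  # 1-based position 25
    return "X4" if pos11 in basic or pos25 in basic else "R5"


# Toy sequences (not real isolates): alanine everywhere except one position.
print(rule_11_25("A" * 35))                    # no basic residue at 11 or 25
print(rule_11_25("A" * 10 + "R" + "A" * 24))   # arginine at position 11
```

Because the rule looks at two fixed positions only, it depends on a correct alignment, which is exactly where insertions and deletions in V3 cause trouble.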
Figure 4. X4 class probabilities for sequences as predicted by the two first-level random forests.
The vertical and horizontal axes give probabilities from the electrostatics-based and hydrophobicity-based random forests, respectively. These data points are the input for the second-level learning. Note that the sets of R5-tropic sequences (circles) and X4/R5X4-tropic sequences (crosses) can be separated quite well in the plane spanned by the two descriptors.
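The second-level step can be pictured as fitting a small classifier on pairs of first-level probabilities. The sketch below is an illustration under stated assumptions: it substitutes a plain logistic regression trained by gradient descent for the random forest the authors used at the second level, and the probability pairs, labels, function names, learning rate, and epoch count are all made up for the example.

```python
import math


def fit_second_level(pairs, labels, lr=0.5, epochs=2000):
    """Fit p(X4) = sigmoid(w0 + w1 * p_elec + w2 * p_hydro) by gradient descent.

    A logistic-regression stand-in for the second-level random forest.
    """
    w = [0.0, 0.0, 0.0]
    n = len(pairs)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (p_elec, p_hydro), y in zip(pairs, labels):
            z = w[0] + w[1] * p_elec + w[2] * p_hydro
            err = 1.0 / (1.0 + math.exp(-z)) - y  # prediction minus target
            grad[0] += err
            grad[1] += err * p_elec
            grad[2] += err * p_hydro
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w


def predict_second_level(w, p_elec, p_hydro):
    """Combined X4 probability from the two first-level probabilities."""
    z = w[0] + w[1] * p_elec + w[2] * p_hydro
    return 1.0 / (1.0 + math.exp(-z))


# Made-up first-level outputs: (electrostatics, hydrophobicity) probabilities.
pairs = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.7), (0.9, 0.9), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]  # assumed encoding: 1 = X4/R5X4, 0 = R5
weights = fit_second_level(pairs, labels)
print(predict_second_level(weights, 0.9, 0.9) > 0.5)
```

In a real evaluation the first-level probabilities fed to the second level should come from held-out predictions, so that the combiner is not trained on outputs the first-level models produced for their own training data.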
