Comparative Study
J Virol. 2006 May;80(10):4698-704.
doi: 10.1128/JVI.80.10.4698-4704.2006.

A reliable phenotype predictor for human immunodeficiency virus type 1 subtype C based on envelope V3 sequences

Mark A Jensen et al. J Virol. 2006 May.

Abstract

In human immunodeficiency virus type 1 (HIV-1) subtype B infections, the emergence of viruses able to use CXCR4 as a coreceptor is well documented and associated with accelerated CD4 decline and disease progression. However, in HIV-1 subtype C infections, responsible for more than 50% of global infections, CXCR4 usage is less common, even in individuals with advanced disease. A reliable phenotype prediction method based on genetic sequence analysis could provide a rapid and less expensive approach to identify possible CXCR4 variants and thus increase our understanding of subtype C coreceptor usage. For subtype B V3 loop sequences, genotypic predictors have been developed based on position-specific scoring matrices (PSSM). In this study, we apply this methodology to a training set of 279 subtype C sequences of known phenotypes (228 non-syncytium-inducing [NSI] CCR5(+) and 51 SI CXCR4(+) sequences) to derive a C-PSSM predictor. Specificity and sensitivity distributions were estimated by combining data set bootstrapping with leave-one-out cross-validation, with random sampling of single sequences from individuals on each bootstrap iteration. The C-PSSM had an estimated specificity of 94% (confidence interval [CI], 92% to 96%) and a sensitivity of 75% (CI, 68% to 82%), which is significantly more sensitive than predictions based on other methods, including a commonly used method based on the presence of positively charged residues (sensitivity, 47.8%). A specificity of 83% and a sensitivity of 83% were achieved with a validation set of 24 SI and 47 NSI unique subtype C sequences. The C-PSSM performs as well on subtype C V3 loops as existing subtype B-specific methods do on subtype B V3 loops. We present bioinformatic evidence that particular sites may influence coreceptor usage differently, depending on the subtype.
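The abstract describes a position-specific scoring matrix (PSSM) predictor evaluated by sensitivity and specificity. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a log-odds matrix built from aligned SI and NSI sequences with a simple pseudocount, and a fixed score cutoff. The function names, the pseudocount, the cutoff of 0, and the five-residue toy fragments are all hypothetical and are not taken from the paper. The sketch scores its own training sequences, so it does not reproduce the bootstrap and leave-one-out procedure the study used.

```python
import math
from collections import Counter

# Twenty amino acids plus the alignment gap character.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def column_frequencies(seqs, pseudocount=1.0):
    """Per-position residue frequencies for equal-length aligned sequences."""
    total = len(seqs) + pseudocount * len(ALPHABET)
    freqs = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        freqs.append({a: (counts.get(a, 0) + pseudocount) / total for a in ALPHABET})
    return freqs


def build_pssm(si_seqs, nsi_seqs, pseudocount=1.0):
    """Log-odds matrix: positive entries are residues enriched in SI sequences."""
    si = column_frequencies(si_seqs, pseudocount)
    nsi = column_frequencies(nsi_seqs, pseudocount)
    return [{a: math.log(si[i][a] / nsi[i][a]) for a in ALPHABET}
            for i in range(len(si))]


def pssm_score(pssm, seq):
    """Sum the log-odds contribution of each residue in the sequence."""
    return sum(column[residue] for column, residue in zip(pssm, seq))


def sensitivity_specificity(scores_si, scores_nsi, cutoff=0.0):
    """Sensitivity: SI scored above cutoff. Specificity: NSI at or below it."""
    true_pos = sum(s > cutoff for s in scores_si)
    true_neg = sum(s <= cutoff for s in scores_nsi)
    return true_pos / len(scores_si), true_neg / len(scores_nsi)


# Hypothetical toy fragments (not real V3 loops), for illustration only.
toy_si = ["CTRPR", "CTRPK", "CIRPR"]
toy_nsi = ["CTRPG", "CTRPS", "CIRPG"]

toy_pssm = build_pssm(toy_si, toy_nsi)
toy_sens, toy_spec = sensitivity_specificity(
    [pssm_score(toy_pssm, s) for s in toy_si],
    [pssm_score(toy_pssm, s) for s in toy_nsi],
)
```

In this toy example only the last position differs between the two groups, so it alone drives the scores. A real evaluation would hold out the sequence being scored, as the study's cross-validation does.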

Figures

FIG. 1.
Sequence logos of HIV-1 subtype C V3 sequences used in this study. The character and size of each logo represent the proportion of an amino acid at the specific site. The subtype C NSI data set is represented by 228 NSI V3 sequences, and the SI data set corresponds to 51 SI V3 sequences.
FIG. 2.
Comparison of C-PSSM score distributions of 229 NSI and 51 SI subtype C sequences. Scores are the median PSSM scores over 100 bootstrapped data sets, as described in Materials and Methods. Box boundaries are interquartile ranges, and central lines are medians over sequences. Error bars extend from the 2.5th and the 97.5th percentiles; beyond these, sequences are represented as outlier points. In a Kruskal-Wallis test, χ² = 78.9 and P < 10⁻¹⁵.
FIG. 3.
Comparison of C-PSSM on subtype C data set with B-PSSM on subtype B data set, using leave-one-out/bootstrap predictions. The boxes show the values described in the legend to Fig. 2.
FIG. 4.
ROC analysis. (A) ROCs for C-PSSM and B-PSSM tests on subtype C sequences. x axis, false-positive fractions of true NSI sequences (equivalent to 1 − specificity); y axis, positive fractions of true SI sequences (equivalent to sensitivity). Thicker lines indicate the median positive fraction over 100 bootstrapped data sets (see text), and thinner lines above and below are 97.5th and 2.5th percentiles, respectively. Solid lines, C-PSSM; dotted lines, B-PSSM. (B) Areas under the ROCs over 100 bootstrapped data sets for all combinations of matrices and data.
FIG. 5.
Effect of sample size on sensitivity and specificity of C-PSSM. Boxes show the values described in the legend to Fig. 2.
FIG. 6.
V3 overlap coefficient P values for subtypes C and B. Dotted lines, P = 0.05; dashed lines, P = 0.01. On the main axis, the V3 site is labeled with subtype C NSI consensus residues. Gray pairs show sites in which the measured OC was significant in one subtype and nonsignificant in the other.
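The ROC analysis in the legend to Fig. 4 plots the false-positive fraction of true NSI sequences against the positive fraction of true SI sequences as the score cutoff varies, and reports the area under each curve. A minimal sketch of that calculation follows, assuming only a list of scores for each class. The function names and example scores are hypothetical, and the bootstrap percentiles shown in the figure are not reproduced.

```python
def roc_points(scores_si, scores_nsi):
    """ROC points (false-positive fraction, true-positive fraction).

    Each observed score is used in turn as a cutoff, from highest to lowest;
    a sequence is called SI when its score is at or above the cutoff.
    """
    cutoffs = sorted(set(scores_si) | set(scores_nsi), reverse=True)
    points = [(0.0, 0.0)]
    for cutoff in cutoffs:
        tpf = sum(s >= cutoff for s in scores_si) / len(scores_si)
        fpf = sum(s >= cutoff for s in scores_nsi) / len(scores_nsi)
        points.append((fpf, tpf))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores, for illustration only.
example_si = [2.0, 1.0, -0.5]
example_nsi = [0.5, -1.0, -2.0]
example_auc = area_under_curve(roc_points(example_si, example_nsi))
```

The area equals the fraction of SI/NSI pairs in which the SI sequence has the higher score (with ties counted as half). Here that is 8 of 9 pairs.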
