Bioinformatics. 2020 Jan 15;36(2):416-421.
doi: 10.1093/bioinformatics/btz585.

THETA: a new genotypic approach for predicting HIV-1 CRF02-AG coreceptor usage

Chloé Dimeglio et al. Bioinformatics.

Abstract

Motivation: The circulating recombinant form CRF02-AG is the most frequent non-B subtype of HIV-1 in Europe. Anti-HIV therapy and pathophysiological studies on the impact of HIV-1 tropism require genotypic determination of tropism for non-B subtypes. However, genotypic approaches based on analysis of the V3 envelope region perform poorly when used to determine the tropism of CRF02-AG. We therefore designed an algorithm, based on information from the gp120 and gp41 ectodomain, that better predicts the tropism of HIV-1 subtype CRF02-AG.

Results: We used a biostatistical method to identify the genotypic determinants of CRF02-AG coreceptor use. The Toulouse HIV Extended Tropism Algorithm (THETA), based on the Least Absolute Shrinkage and Selection Operator (LASSO) method, uses HIV envelope sequences from phenotypically characterized clones. With our model, R5X4/X4 viruses were predicted with a sensitivity of 86% and R5 viruses were identified with a specificity of 89%. The overall accuracy of THETA was 88%, making it sufficiently reliable for predicting the tropism of subtype CRF02-AG sequences.
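
As an illustration of how the three figures above relate to one another, the following is a minimal Python sketch that computes sensitivity, specificity and accuracy from paired phenotypic and predicted tropism labels. It is not the THETA implementation: the function name `tropism_metrics`, the label strings and the toy labels are all assumptions made for this example, and the toy data are invented rather than taken from the study. R5X4 and X4 viruses are grouped into one positive class, here labelled "X4", and R5 is the negative class.

```python
from typing import Dict, Sequence


def tropism_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "X4"
) -> Dict[str, float]:
    """Return sensitivity, specificity and accuracy for a two-class tropism call.

    `positive` is the label of the grouped R5X4/X4 class (hypothetical naming);
    every other label is treated as R5.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # R5X4/X4 virus correctly predicted
            else:
                fn += 1  # R5X4/X4 virus missed
        else:
            if pred == positive:
                fp += 1  # R5 virus wrongly called R5X4/X4
            else:
                tn += 1  # R5 virus correctly predicted

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "sensitivity": tp / (tp + fn),  # share of R5X4/X4 viruses detected
        "specificity": tn / (tn + fp),  # share of R5 viruses recognised as R5
        "accuracy": (tp + tn) / len(y_true),
    }


# Invented toy labels, for illustration only (not data from the study).
toy_true = ["X4", "X4", "X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5"]
toy_pred = ["X4", "X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5", "X4"]
toy_metrics = tropism_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Because accuracy pools both classes, it always falls between the sensitivity and the specificity, weighted by how many viruses of each class are in the test set.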

Availability and implementation: Binaries are freely available for download at https://github.com/viro-tls/THETA. THETA was implemented in MATLAB and is supported on the MS Windows platform. The sequence data used in this work are available from GenBank under the accession numbers MK618182-MK618417.
