Viruses. 2020 May 19;12(5):560.
doi: 10.3390/v12050560.

Drug Resistance Prediction Using Deep Learning Techniques on HIV-1 Sequence Data


Margaret C Steiner et al. Viruses. 2020.

Abstract

The fast replication rate and lack of repair mechanisms of human immunodeficiency virus (HIV) contribute to its high mutation frequency, with some mutations resulting in the evolution of resistance to antiretroviral therapies (ART). As such, studying HIV drug resistance allows for real-time evaluation of evolutionary mechanisms. Characterizing the biological process of drug resistance is also critically important for sustained effectiveness of ART. Investigating the link between "black box" deep learning methods applied to this problem and evolutionary principles governing drug resistance has been overlooked to date. Here, we utilized publicly available HIV-1 sequence data and drug resistance assay results for 18 ART drugs to evaluate the performance of three architectures (multilayer perceptron, bidirectional recurrent neural network, and convolutional neural network) for drug resistance prediction, jointly with biological analysis. We identified convolutional neural networks as the best performing architecture and displayed a correspondence between the importance of biologically relevant features in the classifier and overall performance. Our results suggest that the high classification performance of deep learning models is indeed dependent on drug resistance mutations (DRMs). These models heavily weighted several features that are not known DRM locations, indicating the utility of model interpretability to address causal relationships in viral genotype-phenotype data.

Keywords: HIV; HIV drug resistance; antiretroviral therapy; deep learning; machine learning; neural networks.
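
The abstract relates classifier performance to the importance of individual sequence features. One common way to score such importance is permutation importance: shuffle one feature column and record how much the loss (here 1 - AUC, the measure named in the figure captions) increases. The following is a minimal sketch of that idea, not the authors' implementation; whether they used exactly this procedure is an assumption, and `toy_predict`, the toy data, and all function names are hypothetical.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Fraction of (positive, negative) pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in 1 - AUC when each feature column is shuffled."""
    rng = random.Random(seed)
    base_loss = 1.0 - auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            loss = 1.0 - auc(y, [predict(row) for row in shuffled])
            increase += loss - base_loss
        importances.append(increase / n_repeats)
    return importances


# Toy data (hypothetical): feature 0 determines the label, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
y = [row[0] for row in X]


def toy_predict(row):
    """Hypothetical stand-in for a trained network's resistance score."""
    return float(row[0])


importances = permutation_importance(toy_predict, X, y)
```

In this toy setting, shuffling feature 0 degrades the AUC while shuffling feature 1 leaves it unchanged, so only feature 0 receives a positive importance. In a real analysis each feature would correspond to an encoded sequence position, and the encoding itself is not specified in the abstract.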


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Overview of deep learning methods.
Figure 2
Receiver operating characteristic (ROC) curves for the Multilayer Perceptron classifier. Horizontal axes are specificity; vertical axes are sensitivity. Each of the five curves shows the result of one cross-validation step. Curves that come closer to a right angle in the top left corner represent better performance. Area under the curves is given in Table A4. Abbreviations: FPV = fosamprenavir; ATV = atazanavir; IDV = indinavir; LPV = lopinavir; NFV = nelfinavir; SQV = saquinavir; TPV = tipranavir; DRV = darunavir; 3TC = lamivudine; ABC = abacavir; AZT = azidothymidine; D4T = stavudine; DDI = didanosine; TDF = tenofovir disoproxil fumarate; EFV = efavirenz; NVP = nevirapine; ETR = etravirine; RPV = rilpivirine; PI = protease inhibitor; NRTI = nucleoside reverse transcriptase inhibitor; NNRTI = non-nucleoside reverse transcriptase inhibitor.
Figure 3
ROC curves for the Bidirectional Recurrent Neural Network classifier. Horizontal axes are specificity; vertical axes are sensitivity. Each of the five curves shows the result of one cross-validation step. Curves that come closer to a right angle in the top left corner represent better performance. Area under the curves is given in Table A4. Abbreviations are as defined for Figure 2.
Figure 4
ROC curves for the Convolutional Neural Network classifier. Horizontal axes are specificity; vertical axes are sensitivity. Each of the five curves shows the result of one cross-validation step. Curves that come closer to a right angle in the top left corner represent better performance. Area under the curves is given in Table A4. Abbreviations are as defined for Figure 2.
Figure 5
Annotated feature importance plots for the 20 most important features in Multilayer Perceptron classifiers. Horizontal axes are amino acid positions; vertical axes are feature importance (measured as change in 1-AUC). Abbreviations are as defined for Figure 2.
Figure 6
Annotated feature importance plots for the 20 most important features in Bidirectional Recurrent Neural Network classifiers. Horizontal axes are amino acid positions; vertical axes are feature importance (measured as change in 1-AUC). Abbreviations are as defined for Figure 2.
Figure 7
Annotated feature importance plots for the 20 most important features in Convolutional Neural Network classifiers. Horizontal axes are amino acid positions; vertical axes are feature importance (measured as change in 1-AUC). Abbreviations are as defined for Figure 2.
Figure 8
Annotated phylogenetic trees for each drug class. Resistant sequences are denoted by red labels; non-resistant sequences are denoted by blue labels. Abbreviations are as defined for Figure 2.
Figure 9
Annotated phylogenetic trees for each dataset. Resistant sequences are denoted by red labels; non-resistant sequences are denoted by blue labels. Abbreviations are as defined for Figure 2.
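
The ROC captions above plot sensitivity against specificity and summarize each curve by its area. As a rough illustration of how one such curve and its area can be obtained from a classifier's scores on a single cross-validation fold, here is a minimal sketch; the function names and the example labels and scores are hypothetical and are not taken from the paper.

```python
def roc_points(labels, scores):
    """Return (specificity, sensitivity) pairs, one per decision threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Start with a threshold above every score: nothing is called resistant.
    points = [(1.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # A sample is called resistant when its score is at least t.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((tn / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Integrate sensitivity over decreasing specificity with the trapezoid rule."""
    area = 0.0
    for (spec0, sens0), (spec1, sens1) in zip(points, points[1:]):
        area += (spec0 - spec1) * (sens0 + sens1) / 2.0
    return area


# Hypothetical fold: 1 = resistant, 0 = non-resistant.
fold_labels = [0, 0, 1, 1]
fold_scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(fold_labels, fold_scores)
area = auc_trapezoid(curve)
```

Repeating this for each of the five cross-validation folds would give five curves per drug, which is the layout the captions describe.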
