Bioinformatics. 2019 Sep 15;35(18):3240-3249.
doi: 10.1093/bioinformatics/btz067.

DeepAMR for predicting co-occurrent resistance of Mycobacterium tuberculosis

Yang Yang et al. Bioinformatics. 2019.

Abstract

Motivation: Resistance co-occurrence within first-line anti-tuberculosis (TB) drugs is a common phenomenon. Existing methods based on genetic data analysis of Mycobacterium tuberculosis (MTB) can predict resistance of MTB to individual drugs, but they do not consider resistance co-occurrence and cannot capture the latent structure of genomic data that corresponds to lineages.

Results: We used a large cohort of TB patients from 16 countries across six continents, for which the whole-genome sequence of each isolate was obtained together with its phenotype for anti-TB drugs, determined by the drug susceptibility testing recommended by the World Health Organization. We then proposed an end-to-end multi-task model with a deep denoising auto-encoder (DeepAMR) for multiple drug classification and developed DeepAMR_cluster, a clustering variant based on DeepAMR, for learning clusters in the latent space of the data. DeepAMR outperformed the baseline model and four machine learning models, with mean AUROC from 94.4% to 98.7% for predicting resistance to four first-line drugs [i.e. isoniazid (INH), ethambutol (EMB), rifampicin (RIF), pyrazinamide (PZA)], multi-drug resistant TB (MDR-TB) and pan-susceptible TB (PANS-TB: MTB that is susceptible to all four first-line anti-TB drugs). For INH, EMB, PZA and MDR-TB, DeepAMR achieved its best mean sensitivity of 94.3%, 91.5%, 87.3% and 96.3%, respectively. For RIF and PANS-TB, it achieved 94.2% and 92.2% sensitivity, which were lower than the baseline model by 0.7% and 1.9%, respectively. t-SNE visualization shows that DeepAMR_cluster captures lineage-related clusters in the latent space.

Availability and implementation: The details of source code are provided at http://www.robots.ox.ac.uk/~davidc/code.php.

Supplementary information: Supplementary data are available at Bioinformatics online.
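The quantities reported in the Results can be made concrete with a small sketch. It is not the authors' code: the function names are hypothetical, the MDR-TB rule (resistance to at least INH and RIF) is the standard WHO definition assumed here rather than taken from the abstract, and the PANS-TB rule follows the abstract's definition. AUROC is computed with the pairwise-ranking formulation, which is adequate only for small illustrative inputs.

```python
from typing import Dict, Sequence

FIRST_LINE = ("INH", "EMB", "RIF", "PZA")  # four first-line drugs named in the abstract


def derive_labels(phenotype: Dict[str, int]) -> Dict[str, int]:
    """Add MDR and PANS labels to a per-drug phenotype (1 = resistant, 0 = susceptible)."""
    labels = dict(phenotype)
    # Assumption: MDR-TB means resistant to at least INH and RIF (standard WHO definition).
    labels["MDR"] = int(phenotype["INH"] == 1 and phenotype["RIF"] == 1)
    # PANS-TB: susceptible to all four first-line drugs (definition given in the abstract).
    labels["PANS"] = int(all(phenotype[d] == 0 for d in FIRST_LINE))
    return labels


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores higher than a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `derive_labels({"INH": 1, "EMB": 0, "RIF": 1, "PZA": 0})` marks the isolate as MDR and not pan-susceptible. In a multi-task setting, `sensitivity` and `auroc` would be evaluated once per label.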


Figures

Fig. 1.
Illustration of latent structure using t-SNE: (a) lineage distribution resulting from DeepAMR; (b) phenotype distribution resulting from DeepAMR; (c) lineage distribution resulting from DeepAMR_cluster and (d) predicted clusters resulting from DeepAMR_cluster
Fig. 2.
Overview of phenotype of the examined 13 403 MTB isolates. (a) Histogram showing the phenotype of the MTB isolates for each individual anti-TB drug obtained by the drug susceptibility test (up to 11 anti-TB drugs were tested for all isolates). For each drug, the isolates with missing phenotype were excluded. (b) Heatmap visualizing the proportion of pair-wise resistance co-occurrence (non-diagonal) and mono-resistance (diagonal) across anti-TB drugs. The non-diagonal elements correspond to poly-resistant isolates that were resistant to at least two anti-TB drugs. The co-occurrence matrix is symmetric so the upper right half of the graph shows all pair-wise co-occurrence cases
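A matrix of the kind described for the heatmap in Fig. 2(b) could be computed along the following lines. This is a sketch under stated assumptions, not the authors' procedure: the function name is hypothetical, a missing phenotype is encoded as `None`, and each proportion is taken over the isolates tested for the drug (diagonal) or for both drugs (off-diagonal), since the caption does not state the denominator.

```python
from typing import Dict, List, Optional, Sequence

Phenotype = Dict[str, Optional[int]]  # drug -> 1 (resistant), 0 (susceptible), None (missing)


def cooccurrence_matrix(isolates: Sequence[Phenotype], drugs: Sequence[str]) -> List[List[float]]:
    """Diagonal: proportion mono-resistant to a drug. Off-diagonal: proportion resistant to both."""
    n = len(drugs)
    matrix = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(drugs):
        for j, b in enumerate(drugs):
            if i == j:
                # Isolates with a known phenotype for drug a.
                tested = [iso for iso in isolates if iso.get(a) is not None]
                # Mono-resistance: resistant to a and to no other listed drug.
                hits = sum(
                    1 for iso in tested
                    if iso[a] == 1 and all(iso.get(d) != 1 for d in drugs if d != a)
                )
            else:
                # Isolates with a known phenotype for both drugs.
                tested = [
                    iso for iso in isolates
                    if iso.get(a) is not None and iso.get(b) is not None
                ]
                hits = sum(1 for iso in tested if iso[a] == 1 and iso[b] == 1)
            matrix[i][j] = hits / len(tested) if tested else 0.0
    return matrix
```

Because resistance to both drugs does not depend on their order, the result is symmetric, which matches the caption's remark that the upper right half shows every pair-wise case.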
Fig. 3.
SNPs ranked by permutation feature importance, restricted to those with a positive importance score, for INH, EMB, RIF and PZA, respectively
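Permutation feature importance, the ranking criterion named in the Fig. 3 caption, can be sketched generically as follows. The page does not give the authors' settings, so everything here is an assumption for illustration: the function names, the `predict` and `metric` callables, the number of repeats, and the choice to score each feature as the mean drop in the metric after shuffling that feature's column.

```python
import random
from typing import Callable, List, Sequence, Tuple


def permutation_importance(
    predict: Callable[[List[List[int]]], List[int]],
    X: List[List[int]],
    y: Sequence[int],
    metric: Callable[[Sequence[int], Sequence[int]], float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in `metric` when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = metric(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


def rank_positive(importances: Sequence[float], names: Sequence[str]) -> List[Tuple[float, str]]:
    """Keep features with positive importance, most important first."""
    return sorted(((imp, name) for imp, name in zip(importances, names) if imp > 0), reverse=True)
```

A feature the model ignores gets an importance of exactly zero and drops out of the ranking, which is one plausible reading of the caption's restriction to positive values.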

