Sci Rep. 2016 Aug 23:6:31619.
doi: 10.1038/srep31619.

Multitask learning improves prediction of cancer drug sensitivity

Han Yuan et al. Sci Rep.

Abstract

Precision oncology seeks to predict the best therapeutic option for individual patients based on the molecular characteristics of their tumors. To assess the preclinical feasibility of drug sensitivity prediction, several studies have measured drug responses for cytotoxic and targeted therapies across large collections of genomically and transcriptomically characterized cancer cell lines and trained predictive models using standard methods like elastic net regression. Here we use existing drug response data sets to demonstrate that multitask learning across drugs strongly improves the accuracy and interpretability of drug prediction models. Our method uses trace norm regularization with a highly efficient ADMM (alternating direction method of multipliers) optimization algorithm that readily scales to large data sets. We anticipate that our approach will enhance efforts to exploit growing drug response compendia in order to advance personalized therapy.
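
To make the optimization idea concrete, here is a minimal sketch of trace norm regularized multitask regression solved with ADMM. It is not the authors' implementation. It assumes the objective 0.5*||XW - Y||_F^2 + lam*||W||_*, where X is a cell line by feature matrix, Y is a cell line by drug response matrix with no missing entries, and W holds one weight vector per drug. The splitting W = Z, the function names, and the parameters `lam`, `rho`, and `n_iter` are illustrative assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def trace_norm_admm(X, Y, lam=1.0, rho=1.0, n_iter=200):
    """ADMM for min_W 0.5*||XW - Y||_F^2 + lam*||W||_* using the split W = Z."""
    n_features, n_tasks = X.shape[1], Y.shape[1]
    Z = np.zeros((n_features, n_tasks))   # low-rank copy of the weights
    U = np.zeros_like(Z)                  # scaled dual variable
    A = X.T @ X + rho * np.eye(n_features)
    XtY = X.T @ Y
    for _ in range(n_iter):
        W = np.linalg.solve(A, XtY + rho * (Z - U))  # ridge-like least squares step
        Z = svt(W + U, lam / rho)                    # shrink singular values
        U = U + W - Z                                # dual update
    return Z

# Synthetic example (hypothetical data): 6 "drugs" sharing a rank-2 weight structure.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
W_true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 6))
Y = X @ W_true
W_weak = trace_norm_admm(X, Y, lam=1e-6)    # almost no regularization
W_strong = trace_norm_admm(X, Y, lam=1e9)   # regularization dominates
```

The trace norm penalizes the sum of singular values of W, so the weight vectors of different drugs are pushed toward a shared low-dimensional subspace. That coupling is what lets the tasks inform one another. With a very large `lam`, every singular value is thresholded away and the returned matrix is zero.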

Figures

Figure 1. Multitask learning and single task learning, transductive and inductive learning.
(a) A schematic comparison of single task learning and multitask learning models. (b) Inductive and transductive multitask learning schemes. Training data in both feature and response matrices are colored in blue. Testing data in both feature and response matrices are colored in red.
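
The caption does not define the two schemes, so the following sketch rests on an assumption about how the terms are commonly used: in the inductive setting whole cell lines (rows of the response matrix) are held out, while in the transductive setting individual cell line and drug entries are held out and every cell line can still contribute training responses for other drugs. The function names and the `test_fraction` parameter are hypothetical.

```python
import random

def inductive_split(n_cell_lines, n_drugs, test_fraction=0.2, seed=0):
    """Hold out entire cell lines: mask[i][j] is True when entry (i, j) is test data."""
    rng = random.Random(seed)
    rows = list(range(n_cell_lines))
    rng.shuffle(rows)
    n_test = max(1, int(round(test_fraction * n_cell_lines)))
    test_rows = set(rows[:n_test])
    return [[i in test_rows for _ in range(n_drugs)] for i in range(n_cell_lines)]

def transductive_split(n_cell_lines, n_drugs, test_fraction=0.2, seed=0):
    """Hold out individual (cell line, drug) entries of the response matrix."""
    rng = random.Random(seed)
    entries = [(i, j) for i in range(n_cell_lines) for j in range(n_drugs)]
    rng.shuffle(entries)
    n_test = max(1, int(round(test_fraction * len(entries))))
    held_out = set(entries[:n_test])
    return [[(i, j) in held_out for j in range(n_drugs)] for i in range(n_cell_lines)]

inductive_mask = inductive_split(10, 5)
transductive_mask = transductive_split(10, 5)
```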
Figure 2. Label noise assessment.
(a,b) Label noise estimates for each of the metrics used in CCLE (a) and for each data set (b) were computed by training elastic net regression models on the true labels as well as on permuted labels using a bootstrapping procedure. We tested whether the mean squared error (MSE) of the models trained on the true labels was significantly lower than the MSE of the models trained on the permuted labels. For each metric in the CCLE data set (a), the cumulative distribution shows the number of drugs passing different noise thresholds. The activity area has the lowest level of label noise compared to Amax, IC50 and EC50. The number of drugs passing a threshold of P < 0.01 for activity area, Amax, IC50 and EC50 was 22, 22, 19, and 14, respectively. For each data set (b), the cumulative distribution plot shows the proportion of drugs passing different noise thresholds. This analysis found that 91.7%, 59.9%, and 74.5% of drugs in the CCLE (activity area), CTD2 (area under the dose response curve), and NCI60 (−log10(GI50)) data sets had true-label models that significantly outperformed the permuted-label models in this comparison (P < 0.01, one-sided Wilcoxon rank sum test). (c) Scatter plot of CTD2 drug response dynamic range (interquartile range) versus label noise (−log10(P)), Pearson correlation coefficient = 0.63.
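
The label noise test described in this caption can be sketched roughly as follows. Several parts are assumptions made for illustration and are not taken from the paper: ridge regression stands in for elastic net, error is measured on out-of-bag samples of each bootstrap resample, and the one-sided rank sum P value uses a normal approximation without tie correction. All function names are hypothetical.

```python
import math
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression (a simple stand-in for elastic net)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def bootstrap_mse(X, y, n_boot, rng, permute):
    """Out-of-bag MSE across bootstrap resamples, optionally with shuffled labels."""
    n = len(y)
    mses = []
    for _ in range(n_boot):
        labels = rng.permutation(y) if permute else y
        idx = rng.integers(0, n, size=n)             # bootstrap sample (with replacement)
        oob = np.setdiff1d(np.arange(n), idx)        # rows not drawn in this resample
        if oob.size == 0:
            continue
        w = ridge_fit(X[idx], labels[idx])
        mses.append(float(np.mean((X[oob] @ w - labels[oob]) ** 2)))
    return np.array(mses)

def rank_sum_p_less(a, b):
    """One-sided rank sum P value for 'a tends to be smaller than b' (normal approx.)."""
    ranks = np.argsort(np.argsort(np.concatenate([a, b]))) + 1
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (ranks[:n1].sum() - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2))        # standard normal CDF at z

def label_noise_p(X, y, n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    mse_true = bootstrap_mse(X, y, n_boot, rng, permute=False)
    mse_perm = bootstrap_mse(X, y, n_boot, rng, permute=True)
    return rank_sum_p_less(mse_true, mse_perm)

# Synthetic example (hypothetical data): responses that depend strongly on the features.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 5))
y_signal = X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) + 0.1 * rng.standard_normal(60)
p_signal = label_noise_p(X, y_signal)
```

A small P value means that models trained on the real labels predict better than models trained on shuffled labels, which is the sense in which a drug "passes" a noise threshold in panels (a) and (b).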
Figure 3. Trace norm multitask learning outperforms single task learning for drug sensitivity prediction.
(a) Performance comparison of elastic net and trace norm models on the CCLE data set in a transductive setting. (b,c) Performance comparison of elastic net and trace norm models on the CTD2 and NCI60 data sets in transductive and inductive settings.
Figure 4. Trace norm drug prediction models capture information about drug mechanism of action.
(a) Hierarchical clustering of elastic net model and trace norm model vectors on the NCI60 data set. Drugs with high label noise (P > 0.01) are marked with an asterisk. (b) Heatmaps of gene ontology (GO) enrichment analysis for top weighted genes in the elastic net and trace norm models. GO enrichment analysis was performed on the 100 highest-weighted features in the trace norm and elastic net weight vectors, and P values for the hypergeometric enrichment tests for selected pathways are shown in the heatmap (−log10P values are plotted). Known drug-pathway associations are shown in black boxes (e.g. PD0325901 and AZD6244 are MEK inhibitors and are known to disrupt the MAP kinase pathway). Significant P values (P < 0.01) are colored orange/red whereas nonsignificant ones are shown in white/yellow. (c) Heatmap of weight vectors learned by trace norm on the NCI60 data set. Columns (drugs) are in the same order as (b). Genes in the black box are enriched for single-strand break repair, and genes in the brown box are enriched for double-strand break repair and cell redox homeostasis.
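
The hypergeometric enrichment test mentioned in panel (b) can be illustrated with a short sketch. The gene counts in the example are made up, and the function name and arguments are hypothetical. Only the test itself (an upper-tail hypergeometric probability) and the choice of 100 top-weighted features come from the caption.

```python
from math import comb, log10

def hypergeom_enrichment_p(n_background, n_in_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_in_pathway belong to the pathway."""
    upper = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 background genes, a 200-gene pathway,
# the 100 highest-weighted genes of one drug model, 10 of them in the pathway.
p_value = hypergeom_enrichment_p(20000, 200, 100, 10)
heatmap_value = -log10(p_value)   # the quantity plotted in the heatmap
```

With these made-up counts only about one pathway gene would be expected among the 100 selected by chance, so an overlap of 10 yields a P value far below the 0.01 threshold used in the figure.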
