Brief Bioinform. 2022 Jan 17;23(1):bbab356.
doi: 10.1093/bib/bbab356.

A cross-study analysis of drug response prediction in cancer cell lines

Fangfang Xia et al. Brief Bioinform.

Abstract

To enable personalized cancer treatment, machine learning models have been developed to predict drug response as a function of tumor and drug features. However, most algorithm development efforts have relied on cross-validation within a single study to assess model accuracy. While an essential first step, cross-validation within a biological data set typically provides an overly optimistic estimate of the prediction performance on independent test sets. To provide a more rigorous assessment of model generalizability between different studies, we use machine learning to analyze five publicly available cell line-based data sets: National Cancer Institute 60 (NCI60), Cancer Therapeutics Response Portal (CTRP), Genomics of Drug Sensitivity in Cancer (GDSC), Cancer Cell Line Encyclopedia (CCLE) and Genentech Cell Line Screening Initiative (gCSI). Based on observed experimental variability across studies, we explore estimates of prediction upper bounds. We report performance results of a variety of machine learning models, with a multitasking deep neural network achieving the best cross-study generalizability. By multiple measures, models trained on CTRP yield the most accurate predictions on the remaining testing data, and gCSI is the most predictable among the cell line data sets included in this study. With these experiments and further simulations on partial data, two lessons emerge: (1) differences in viability assays can limit model generalizability across studies and (2) drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening.

Keywords: deep learning; drug response prediction; drug sensitivity; precision oncology.
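
The gap between within-study validation and cross-study testing described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the model is a one-feature least-squares line, and `assay_offset` is a hypothetical parameter standing in for assay differences between two made-up studies. All function names are assumptions introduced for illustration.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def make_study(rng, n, assay_offset):
    """Synthetic (feature, response) pairs; assay_offset mimics a study-specific shift."""
    xs = [rng.random() for _ in range(n)]
    ys = [0.5 * x + 0.2 + assay_offset + rng.gauss(0.0, 0.02) for x in xs]
    return xs, ys


rng = random.Random(0)                      # fixed seed for reproducibility
xa, ya = make_study(rng, 400, 0.0)          # hypothetical training study
xb, yb = make_study(rng, 400, 0.15)         # hypothetical study with a shifted assay

# Train on the first 300 samples of study A; hold out the last 100.
slope, intercept = fit_line(xa[:300], ya[:300])


def predict(xs):
    return [slope * x + intercept for x in xs]


within_r2 = r2_score(ya[300:], predict(xa[300:]))   # held-out data, same study
cross_r2 = r2_score(yb, predict(xb))                # independent study
print(f"within-study R2: {within_r2:.3f}")
print(f"cross-study  R2: {cross_r2:.3f}")
```

In this toy setup the held-out score from the same study stays high, while the systematic offset in the second study pulls the cross-study score down, which is the kind of optimism the abstract warns about.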


Figures

Figure 1
Fitted dose–response curves from multiple studies. This example shows the dose response of the LOXIMVI melanoma cell line treated with paclitaxel. Curves have been consistently fitted across the studies. Experimental measurements from multiple sources and replicates are not in complete agreement.
Figure 2
Estimating cross-study response variability based on overlapping experimental data. When the same combination of drug and cell line appears in multiple studies, we can use the reported differences to estimate cross-study response variability. Here we map the AUC values from CTRP to CCLE (left) and GDSC (right) in the scatter plots with linear regression fit. Orange dots represent experiments involving common drugs shared by CTRP, CCLE and GDSC, reducing sampling bias in different studies. In both plots, R² scores are reported separately for the overall fit and the subset with common drugs. Overall, there is greater agreement between CTRP and CCLE (same assay) than between CTRP and GDSC (different assay).
Figure 3
Impact of cell line or drug diversity on model generalizability. Models trained on partial CTRP data are tested on CCLE and GDSC. Shaded regions indicate the SD of the cross-study R². (A) Models trained with all CTRP drugs and a fraction of cell lines. (B) Models trained with all CTRP cell lines and a fraction of drugs.
Figure 4
An example configuration of the multitasking drug response prediction network (UnoMT). The network predicts a number of cell line properties (tissue category, tumor site, cancer type, gene expression autoencoder) as well as drug properties (target family, drug-likeness score) in addition to drug response.
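
The overlap-based variability estimate described for Figure 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the drug and cell line identifiers and the AUC values are invented placeholders, and `overlap_fit` is a hypothetical helper that fits a least-squares line to the responses of (drug, cell line) pairs present in both studies and reports R².

```python
def overlap_fit(study_x, study_y):
    """Fit y = slope * x + intercept on (drug, cell line) pairs shared by two studies.

    Each study is a dict mapping (drug, cell_line) -> AUC.
    Returns (number of shared pairs, slope, intercept, R^2 of the fit).
    """
    keys = sorted(set(study_x) & set(study_y))   # overlapping experiments only
    xs = [study_x[k] for k in keys]
    ys = [study_y[k] for k in keys]
    n = len(keys)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return n, slope, intercept, 1.0 - ss_res / ss_tot


# Placeholder AUC values for two hypothetical studies (not real measurements).
study_a = {
    ("drug1", "cellA"): 0.30, ("drug1", "cellB"): 0.55,
    ("drug2", "cellA"): 0.70, ("drug2", "cellB"): 0.90,
    ("drug3", "cellA"): 0.45,                    # only in study A
}
study_b = {
    ("drug1", "cellA"): 0.35, ("drug1", "cellB"): 0.50,
    ("drug2", "cellA"): 0.80, ("drug2", "cellB"): 0.85,
    ("drug4", "cellA"): 0.60,                    # only in study B
}

n_shared, slope, intercept, r2 = overlap_fit(study_a, study_b)
print(f"{n_shared} shared pairs, slope={slope:.2f}, intercept={intercept:.2f}, R2={r2:.2f}")
```

Restricting both dicts to a common drug list before calling the helper would correspond to the "common drugs" subset highlighted in the figure.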

