PLoS Comput Biol. 2023 Aug 21;19(8):e1011382.
doi: 10.1371/journal.pcbi.1011382. eCollection 2023 Aug.

A subcomponent-guided deep learning method for interpretable cancer drug response prediction

Xuan Liu et al. PLoS Comput Biol.

Abstract

Accurate prediction of cancer drug response (CDR) is a longstanding challenge in modern oncology that underpins personalized treatment. Current computational methods predict CDR by modeling responses between entire drugs and cell lines, without considering that response outcomes may be attributable primarily to a few finer-level 'subcomponents', such as privileged substructures of the drug or gene signatures of the cancer cell; their predictions are therefore hard to explain. Herein, we present SubCDR, a subcomponent-guided deep learning method for interpretable CDR prediction that recognizes the most relevant subcomponents driving response outcomes. Technically, SubCDR is built upon a series of deep neural networks that extract a set of functional subcomponents from each drug and cell line profile, and it breaks CDR prediction down into identifying pairwise interactions between subcomponents. This subcomponent interaction form offers a traceable path that explicitly indicates which subcomponents contribute more to the response outcome. We verify the superiority of SubCDR over state-of-the-art CDR prediction methods through extensive computational experiments on the GDSC dataset. Crucially, we found many predicted cases that demonstrate the strength of SubCDR in finding the key subcomponents driving responses and in exploiting these subcomponents to discover new therapeutic drugs. These results suggest that SubCDR will be highly useful for biomedical researchers, particularly in anti-cancer drug design.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig 1. Overview of the SubCDR.
(1) Extraction of drug subcomponents. The SMILES string is decomposed into a set of fragments using the BRICS algorithm, to obtain substructures (as subcomponents) for the drug, and the GRU layer is adopted to capture the latent features of substructures. (2) Extraction of cell line subcomponents. The transcriptome profile is converted into a set of gene subsets (as subcomponents) according to the CGC classification, and the latent features of gene subsets are learned by the CNN layer. (3) Construction of subcomponent interactions. An interaction map measuring interaction intensity among subcomponents is generated by Eq 5, which is further established as a network. Later, we leverage the GCN layer to learn the representations hidden in the network. (4) Extraction of side information. The side information of drugs and cell lines is acquired from the known CDRs through a singular value decomposition (SVD) algorithm. (5) Predicting CDRs. The side information combined with the learned representations is fed into a decoder, a multi-layer perceptron, to output final response values.
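Step (4) of the overview can be illustrated with a small sketch. The caption says only that side information comes from the known CDRs via SVD; the truncation rank `k`, the square-root scaling of the singular values, the drugs-by-cell-lines matrix orientation, and the toy response values below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def svd_side_information(responses: np.ndarray, k: int):
    """Return rank-k latent features for drugs (rows) and cell lines (columns).

    `responses` is a hypothetical drugs x cell-lines matrix of known response
    values. How unknown entries are handled is not stated in the source; this
    sketch assumes they have already been filled in.
    """
    u, s, vt = np.linalg.svd(responses, full_matrices=False)
    scale = np.sqrt(s[:k])                 # split singular values evenly between both sides
    drug_features = u[:, :k] * scale       # shape: (n_drugs, k)
    cell_features = vt[:k, :].T * scale    # shape: (n_cell_lines, k)
    return drug_features, cell_features


# Toy matrix: 4 hypothetical drugs x 3 hypothetical cell lines.
toy_responses = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
])
drug_side, cell_side = svd_side_information(toy_responses, k=2)
```

With `k` equal to the full rank, the product `drug_features @ cell_features.T` reproduces the input matrix; smaller `k` gives a compressed summary of each drug's and cell line's response pattern.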
Fig 2. Depiction of the BRICS procedure.
The root (Aspirin) of the tree is the molecule to be split, where the leaves (enclosed by dashed circles) represent the extracted substructures and ‘*’ denotes the dummy atom. At each iteration, the molecule atoms are scanned from left to right according to the SMILES order, extracting a substructure as soon as a breakable bond is found. The process is repeated until the remaining substructures cannot be split further. The dashed bonds with a green highlight are the ones chosen to break using the BRICS rules.
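A rough way to try BRICS fragmentation on Aspirin is RDKit's `BRICS.BRICSDecompose`. This is only a sketch: RDKit returns the set of fragments from its own decomposition routine, which is not necessarily the left-to-right, one-bond-at-a-time procedure described in the caption, and the helper name `brics_fragments` is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def brics_fragments(smiles: str) -> list[str]:
    """Return BRICS fragments of a molecule as a sorted list of SMILES strings."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # BRICSDecompose returns a set of fragment SMILES; dummy atoms ('*')
    # mark the positions where a bond was broken.
    return sorted(BRICS.BRICSDecompose(mol))


aspirin_smiles = "CC(=O)Oc1ccccc1C(=O)O"
fragments = brics_fragments(aspirin_smiles)
for fragment in fragments:
    print(fragment)
```

Each printed fragment plays the role of one drug subcomponent (substructure) in the sense of the caption.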
Fig 3. CGC gene classification.
All 656 CGC genes in our work are initially divided into eight gene subsets (i.e., oncogene, tumour suppressor gene (TSG), fusion gene, the gene with unknown function (none), and their four overlaps) according to their role in cancer. Numbers correspond to the number of genes in each of the gene subsets. Note that the specific classification of CGC genes in cell lines differs, depending on the tumour type.
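The count of eight subsets follows from three roles (oncogene, TSG, fusion) that a gene may hold in any combination, including none of them. The sketch below groups genes accordingly; the gene names and role annotations are placeholders rather than CGC data, and the label format is an assumption.

```python
from collections import defaultdict
from itertools import combinations

ROLES = ("oncogene", "TSG", "fusion")


def subset_label(roles) -> str:
    """Map a collection of roles to a subset label; no role at all maps to 'none'."""
    unknown = set(roles) - set(ROLES)
    if unknown:
        raise ValueError(f"Unknown roles: {sorted(unknown)}")
    present = [r for r in ROLES if r in roles]
    return "+".join(present) if present else "none"


def group_genes(annotations: dict) -> dict:
    """Group gene names by the subset their role annotation falls into."""
    groups = defaultdict(list)
    for gene, roles in annotations.items():
        groups[subset_label(roles)].append(gene)
    return dict(groups)


# All possible subsets: 'none', three single roles, and four overlaps.
ALL_LABELS = [
    subset_label(set(combo))
    for size in range(len(ROLES) + 1)
    for combo in combinations(ROLES, size)
]

# Placeholder annotations (hypothetical genes, not taken from CGC).
example = {
    "GENE_A": {"oncogene"},
    "GENE_B": {"TSG", "fusion"},
    "GENE_C": set(),
    "GENE_D": {"oncogene"},
}
grouped = group_genes(example)
```

Each resulting group corresponds to one cell line subcomponent, whose expression values would then be passed to the feature extractor.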
Fig 4. The performances of SubCDR and baseline methods.
(a) RMSE, PCC, and R² scores of all methods in three scenarios and on the independent test. (b) RMSE scores of all methods across different cancer types of cell lines (as defined in the TCGA study; top) and target pathway types of drugs (bottom).
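For reference, the three reported metrics can be computed as follows. These are the standard textbook definitions of root mean squared error, Pearson correlation coefficient, and coefficient of determination; the page does not say which implementation the authors used, and the sample values are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted responses."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values: predictions are offset from the truth by a constant 1.0.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
scores = (rmse(observed, predicted), pcc(observed, predicted), r2(observed, predicted))
```

The constant offset leaves the correlation perfect while still lowering R², which is why the three metrics are reported together.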
Fig 5. Visualization of subcomponent interactions.
Each interaction map is processed by Softmax and rendered as a heatmap; the rows and columns of the map denote cell line and drug subcomponents, respectively. (a) Cases of drug Afatinib’s response to cell lines NCI-H1666 and NCI-H1648. (b) Cases of drug Bortezomib’s response to cell lines VCaP and PC-3. (c) Cases of cell line HTB-9’s response to Foretinib, Ipatasertib, and Tamoxifen. (d) Cases of drugs Fludarabine and Nelarabine responding to cell lines EM-2, KU812, and LAMA-84. (e) Cases of drugs OF-1 and OSI-027 responding to cell line EC-GI-10, and cases of drugs AGI-6780 and AZ6102 responding to cell line T47D.
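The Softmax step behind such heatmaps can be sketched as below. The raw scores are invented, and normalizing over the whole map (rather than per row or per column) is an assumption, since the caption does not specify the axis.

```python
import math


def softmax_map(scores):
    """Normalize a 2-D interaction map so that all entries sum to 1.

    Rows index cell line subcomponents and columns index drug subcomponents.
    """
    peak = max(value for row in scores for value in row)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [[math.exp(value - peak) for value in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[value / total for value in row] for row in exps]


def top_pair(weights):
    """Return (row, column) of the strongest subcomponent interaction."""
    cells = ((i, j) for i, row in enumerate(weights) for j in range(len(row)))
    return max(cells, key=lambda ij: weights[ij[0]][ij[1]])


# Invented raw scores for 2 cell line subcomponents x 2 drug subcomponents.
raw_scores = [[0.0, 1.0], [2.0, 0.5]]
weights = softmax_map(raw_scores)
strongest = top_pair(weights)
```

The highest-weight cell of the normalized map points to the gene subset and substructure pair that would be highlighted in the heatmap.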
Fig 6. Representative examples illustrating the advantage of SubCDR in discovering new drugs with sensitive responses.
(a) Cases of cell line MV-4-11’s response to drugs Fludarabine and Nelarabine. (b) Case of cell line HCC366’s response to drug Gemcitabine.
Fig 7. Performances of SubCDR with its variants (ablation analysis).

