Cell Rep Med. 2025 Apr 15;6(4):102053.
doi: 10.1016/j.xcrm.2025.102053. Epub 2025 Apr 4.

CAN-Scan: A multi-omic phenotype-driven precision oncology platform identifies prognostic biomarkers of therapy response for colorectal cancer

Shumei Chia et al. Cell Rep Med.

Abstract

Application of machine learning (ML) on cancer-specific pharmacogenomic datasets shows immense promise for identifying predictive response biomarkers to enable personalized treatment. We introduce CAN-Scan, a precision oncology platform, which applies ML on next-generation pharmacogenomic datasets generated from a freeze-viable biobank of patient-derived primary cell lines (PDCs). These PDCs are screened against 84 Food and Drug Administration (FDA)-approved drugs at clinically relevant doses (Cmax), focusing on colorectal cancer (CRC) as a model system. CAN-Scan uncovers prognostic biomarkers and alternative treatment strategies, particularly for patients unresponsive to first-line chemotherapy. Specifically, it identifies gene expression signatures linked to resistance against 5-fluorouracil (5-FU)-based drugs and a focal copy-number gain on chromosome 7q, harboring critical resistance-associated genes. CAN-Scan-derived response signatures accurately predict clinical outcomes across four independent, ethnically diverse CRC cohorts. Notably, drug-specific ML models reveal regorafenib and vemurafenib as alternative treatments for BRAF-expressing, 5-FU-insensitive CRC. Altogether, this approach demonstrates significant potential in improving biomarker discovery and guiding personalized treatments.

Keywords: 5-FU resistance; PDC; biomarker; chromosome 7 amplification; colorectal cancer; drug screen; head and neck cancer; machine learning; patient-derived cancer models; pharmacogenomics; precision oncology.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
CAN-Scan: A multi-omic, PDC-based, phenotype-driven precision oncology platform for biomarker and target discovery. (A) Schematic of the CAN-Scan workflow. PDCs were characterized by drug response, whole-exome sequencing (WES), and whole-transcriptome sequencing (WTS). Machine learning (ML) techniques were used to train drug-specific predictive models that correlate molecular signatures with response. (B) Percentage of parental somatic SNVs (among CRC driver genes [OncoKB]) captured in paired PDCs, which are located along the diagonal. PDCs derived from different lesions obtained from the same individuals are highlighted in green boxes. (C) Frequently mutated SNVs among known cancer drivers (in 42 CRC-PDCs). Alterations include missense variants (green) or stop-gain/nonsense (yellow) mutations. Clinical information (primary or metastatic lesion), tumor staging (stage), site of tumor resection (side: ascending colon [left], transverse or descending colon [right]), iCMS, and MSI/MSS status (MSI/MSS) are included. (D) Comparison of the CNA landscape in CRC-PDCs and CRC-TCGA (Spearman’s rho = 0.63, p value < 0.001). (E) Spearman correlation of paired PDCs and parental tissue based on expression of the top 1,000 protein-coding genes. Models derived from different lesions obtained from the same individuals are highlighted (green box). (F) Hierarchical clustering of PDCs based on expression of the top 2,000 highly variable genes (HVGs) (coding and non-coding). (G) Immunofluorescence (IF) staining of OSCC-PDC and CRC-PDC models for expression of various cell states, signaling pathway activities, and cell proliferation. Scale bar, 50 μm.
Figure 2
Therapeutic response of PDC models to FDA-approved compounds screened at Cmax reflects patient outcome. (A) Hierarchical clustering of PDCs based on their normalized average cell viability score (ACV). Standard-of-care (SoC) drugs for CRC are highlighted in red. Patients highlighted in green demonstrate better response to regorafenib compared to 5-FU. Molecular targets of drugs (MEK = green; PI3K/TOR = blue; EGFR = purple; microtubule = olive; topoisomerase = red). Unmeasured data points are marked N.D. (B) Hierarchical clustering of PDCs based on ACV identifies responders, moderate responders, and non-responders to SoC. (C) Pearson correlation coefficient (r) between normalized ACV and patient outcome based on progression-free survival (PFS). Disease-recurrent patients (R) are labeled in blue; patients with no evidence of disease (NED) are labeled in orange. (D) Clinical response (partial response [PR], progressive disease [PD]) of patients to 5-FU-based treatment (x axis) and ACV of matched CRC-PDCs (y axis). (E) Response of CRC-PDCs to 5-FU, FOLFOX, and FOLFIRI. Student’s t tests were carried out between various groups. ∗p value < 0.05 and ∗∗p value < 0.01. (F) Differential drug response (rel-ACV) between paired CRC-PDCs from primary and metastatic lesions (ACVpri/ACVmet) was log10 transformed. The red line marks |log10(rel-ACV)| > 0.22.
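The paired primary/metastatic comparison in panel (F) reduces to a ratio followed by a log transform. The following is a minimal Python sketch of that arithmetic, assuming ACV values are given as percentages; the function names and the example values are hypothetical and are not taken from the study.

```python
import math

def log_rel_acv(acv_primary, acv_metastatic):
    """Return log10(ACVpri / ACVmet) for one drug in a paired PDC set."""
    return math.log10(acv_primary / acv_metastatic)

def is_differential(acv_primary, acv_metastatic, threshold=0.22):
    """Flag a drug whose |log10(rel-ACV)| exceeds the threshold.

    The default 0.22 is the cutoff quoted in the caption.
    """
    return abs(log_rel_acv(acv_primary, acv_metastatic)) > threshold

# Hypothetical ACV values (percent viability), for illustration only.
pairs = {"drug_A": (80.0, 40.0), "drug_B": (55.0, 50.0)}
flags = {drug: is_differential(pri, met) for drug, (pri, met) in pairs.items()}
```

On the log10 scale a cutoff of 0.22 corresponds to roughly a 1.66-fold difference in ACV between the two lesions, in either direction.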
Figure 3
Correlating gene expression with drug response reveals response biomarkers. (A) Clustering of drugs based on Spearman correlation between pathway score (gene set variation analysis [GSVA]) and drug response (ACV). (B) The top 100 Spearman-ranked genes that correlated with individual drug response (ACV-AUC) were used to identify shared genes among drugs with similar molecular targets. (C) Top 20 enriched Hallmark pathways based on the top 300 genes that negatively correlated with 5-FU ACV, ranked by Spearman correlation. (D and G) Protein-protein interaction (PPI) network of the top 300 genes that (D) negatively or (G) positively correlated with 5-FU ACV, based on Spearman correlation. Each node represents a group of interacting proteins annotated for a specific function in the STRING database. (E) The top 13 most significantly enriched Hallmark pathways among the top 300 genes that positively correlated with 5-FU ACV, ranked by Spearman correlation. (F) Spearman correlation between MUC3A expression (MUC3A-GXP) and FOLFOX-ACV in CRC-PDCs (r = 0.49, p value = 0.00048). (H) Response of 5-FU-insensitive (5-FU-Res: as defined by ACV > 60%) (top) and 5-FU-sensitive (5-FU-Sen: as defined by ACV < 50%) CRC-PDCs to various pathway inhibitors. (I) Average caspase activity (ACA) score of CRC-PDCs to SN-38 (left) and vemurafenib (right) based on iCMS subtyping. Student’s t tests: ∗p value < 0.05 and ∗∗p value < 0.01. (J) The response (ACV) of CRC-PDCs to SoC (5-FU [left], FOLFOX [middle], and FOLFIRI [right]) based on cluster 1/2A/2B/3/4. Student’s t tests: ∗p value < 0.05 and ∗∗p value < 0.01.
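Several panels above rank genes by their Spearman correlation with ACV and split PDCs into 5-FU-Res and 5-FU-Sen groups by ACV cutoffs. The sketch below shows one way to do both steps in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the gene names and values are made up, the helper names are hypothetical, and the label "unclassified" for ACV between 50% and 60% is an assumption, since the caption does not define that range.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / (sx * sy)

def top_correlated(expression, acv, n=300, positive=True):
    """Return the n genes most positively (or negatively) correlated with ACV."""
    scored = {gene: spearman(values, acv) for gene, values in expression.items()}
    return sorted(scored, key=scored.get, reverse=positive)[:n]

def classify_5fu(acv_percent):
    """Apply the caption's cutoffs: Res if ACV > 60%, Sen if ACV < 50%."""
    if acv_percent > 60:
        return "5-FU-Res"
    if acv_percent < 50:
        return "5-FU-Sen"
    return "unclassified"  # assumption: the 50-60% range is not defined in the caption

# Hypothetical data: one ACV value per PDC, one expression vector per gene.
acv = [20, 35, 50, 65, 90]
expression = {
    "gene_up": [1, 2, 3, 4, 5],
    "gene_down": [9, 7, 6, 3, 1],
    "gene_flat": [2, 5, 1, 4, 3],
}
resistance_linked = top_correlated(expression, acv, n=1, positive=True)
sensitivity_linked = top_correlated(expression, acv, n=1, positive=False)
```

Because a higher ACV means more cells survive the drug, genes that correlate positively with ACV are candidates for resistance markers, and negatively correlated genes track sensitivity.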
Figure 4
Integrating gene expression with CNA reveals DNA structural alteration response biomarkers against SoC. (A) Volcano plot displaying the significance (Student’s t test; y axis) and magnitude (x axis) of genomic alteration-drug sensitivity associations based on ACV, for copy-number gain/amplification (left) and deletion (right). (B and C) The response (ACV) of CRC-PDCs to 5-FU and corresponding expression of (B) MUC3A and (C) BRAF were compared across different copy-number statuses using ANOVA (5-FU-MUC3A [p value = 0.0038] and 5-FU-BRAF [p value = 0.00591]) and Dunnett tests (∗p value < 0.05 and ∗∗p value < 0.01). (D and E) Evaluating the significance of progression-free survival (PFS) and/or overall survival (OS) of 5-FU-treated patients with CRC in TCGA stratified based on median expression of (D) single or (E) 11 gene markers, using log rank tests. (F) Copy-number plot illustrating variation in CRC-PDC with (CRC2756Pri) and without (CRC2440Pri) chromosome 7 gain. (G) Pairwise Pearson correlation for copy-number alterations within the TCGA cohort. (H) Overexpression of BRAF in CRC2440 (left). Dose response of CRC2440 to 5-FU, with and without BRAF overexpression. Cell viability was determined using CellTiter-Glo. Data are from triplicates; error bars represent mean ± SD. (I and J) Evaluating the significance of OS and PFS of 5-FU-treated patients with CRC stratified based on median expression of WFS1 and GAS8 using log rank tests.
Figure 5
Integration of multi-omic (mutation and gene expression) molecular signatures using ML approaches to generate models that predict patient response. (A) The ML framework includes feature selection, training with linear and non-linear algorithms, and evaluation on alternative datasets. (B and C) Models trained on the top 50 Spearman-ranked genes that correlated with 5-FU drug response data measured either at a single concentration (PDC: Cmax; GDSC: IC50) or at multiple concentrations (area under the curve [AUC]) using the (B) CRC-PDC or (C) GDSC dataset, respectively, were used to predict cell viability (CV) for CRC-5-FU responders (R) and non-responders (NR) in TCGA. Student’s t tests were performed. (D) Evaluating progression-free survival (PFS) and/or overall survival (OS) of 5-FU-treated patients with CRC in TCGA stratified based on median predicted CV score, by PDC-5-FU-Cmax 50-gene-based model, using log rank tests. (E) Performance (Student’s t test; negative log2[p value]) of models trained using various numbers of genes and drug response measures (Cmax or AUC). (F) Relapse-free survival (RFS) of 5-FU-treated patients with CRC in the Korean cohort stratified based on predicted CV score (cutoff score 70) (p value ∼0.0085), generated using a 50-gene-based 5-FU-ENET model. (G) RFS of FOLFOX-treated patients with CRC in the Korean cohort stratified based on predicted CV score (cutoff score 60) (p value ∼0.0013), generated using a 100-gene-based FOLFOX-ENET model. (H) RFS of 5-FU-treated patients with CRC in PETACC-307 stratified based on median predicted CV score (p value < 0.05), generated using a 50-gene-based 5-FU-ENET model. (I) Frequency analysis of drug response values (ACV) in PDCs (blue) and GDSC (orange). (J) Spearman correlation between drug response scores (ACV) for drugs targeting microtubule and topoisomerase in GDSC or PDC datasets. (K) The R² value of the best-performing model based on SNV (blue), RNA expression (red), or their combination (yellow) for each drug. (L and M) A combinatorial PDC dataset-based ML model (60-SNV, 300-gene, Cmax concentration) predicted CV for 5-FU-treated patients with CRC in TCGA. (L) Evaluating PFS and/or OS of 5-FU-treated patients with CRC in TCGA stratified based on median CV score predicted by the PDC-5-FU-Cmax combined ENET model (60-SNV, 300-gene), using log rank tests. (M) Student’s t tests were carried out between the predicted CV scores for R and NR. (N) Performance (negative log2[p value]) of models trained using various numbers of genes and SNVs and drug response (Cmax). (O) Hierarchical clustering of patients with colon adenocarcinoma/rectum adenocarcinoma (COAD/READ) in TCGA based on their predicted CV score (SVR-linear model, 100 genes, Cmax response data) for selected drugs (responder: score < 50%; non-responder: score > 50%). Drugs regorafenib and vemurafenib are indicated by red arrowheads.
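Three small calculations recur in these panels: splitting a cohort at the median predicted CV score, calling responders at a fixed cutoff (50% in panel O), and reporting model performance as a negative log2 p value. The Python sketch below illustrates only that arithmetic, not the ML models themselves. The patient IDs and scores are hypothetical, and the handling of scores exactly at the median or cutoff is an assumption, because the captions do not specify it.

```python
import math
from statistics import median

def median_split(scores):
    """Label each patient 'high' or 'low' relative to the cohort median.

    Assumption: a score equal to the median is placed in the 'low' group.
    """
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}

def call_response(predicted_cv, cutoff=50.0):
    """Responder if predicted CV is below the cutoff, else non-responder.

    Assumption: a score exactly at the cutoff counts as non-responder.
    """
    return "responder" if predicted_cv < cutoff else "non-responder"

def neg_log2_p(p_value):
    """Convert a p value to the -log2 scale used for model comparison."""
    return -math.log2(p_value)

# Hypothetical predicted CV scores (percent), for illustration only.
predicted = {"P1": 30.0, "P2": 45.0, "P3": 62.0, "P4": 80.0}
groups = median_split(predicted)
calls = {pid: call_response(s) for pid, s in predicted.items()}
```

On the -log2 scale a larger value means a smaller p value, so the p value of about 0.0085 quoted for panel (F) corresponds to a score of roughly 6.9.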
