Enhancing drug and cell line representations via contrastive learning for improved anti-cancer drug prioritization

Patrick J Lawrence¹, Benjamin Burns², Xia Ning³,⁴,⁵

Affiliations

¹ Biomedical Informatics Department, The Ohio State University, 1800 Cannon Drive, Lincoln Tower 250, Columbus, OH, 43210, USA.
² Computer Science and Engineering Department, The Ohio State University, 2015 Neil Avenue, Columbus, OH, 43210, USA.
³ Biomedical Informatics Department, The Ohio State University, 1800 Cannon Drive, Lincoln Tower 250, Columbus, OH, 43210, USA. ning.104@osu.edu.
⁴ Computer Science and Engineering Department, The Ohio State University, 2015 Neil Avenue, Columbus, OH, 43210, USA. ning.104@osu.edu.
⁵ Translational Data Analytics Institute, The Ohio State University, 1760 Neil Avenue, Columbus, OH, 43210, USA. ning.104@osu.edu.

PMID: 38762647
PMCID: PMC11102516
DOI: 10.1038/s41698-024-00589-8

Patrick J Lawrence et al. NPJ Precis Oncol. 2024 May 18;8(1):106. doi: 10.1038/s41698-024-00589-8.


Abstract

Due to cancer's complex nature and variable response to therapy, precision oncology informed by omics sequence analysis has become the current standard of care. However, the amount of data produced for each patient makes it difficult to quickly identify the best treatment regimen. Moreover, limited data availability has hindered computational methods' abilities to learn patterns associated with effective drug-cell line pairs. In this work, we propose the use of contrastive learning to improve learned drug and cell line representations by preserving relationship structures associated with drug mechanisms of action and cell line cancer types. In addition to achieving enhanced performance relative to a state-of-the-art method, we find that classifiers using our learned representations exhibit a more balanced reliance on drug- and cell line-derived features when making predictions. This facilitates more personalized drug prioritizations that are informed by signals related to drug resistance.
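To make the idea of contrastive learning over pairs concrete, the following is a minimal sketch of a classic margin-based contrastive loss. It is an illustration only: the abstract does not specify the loss function used, and the function names, the margin value, and the toy 2-D embeddings are all assumptions introduced here. A pair that shares a group (for example, two drugs with the same mechanism of action, or two cell lines of the same cancer type) is pulled together, while a pair from different groups is pushed apart until it is at least `margin` away.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_group, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    same_group=True  -> penalize any distance between the pair.
    same_group=False -> penalize only pairs closer than `margin`.
    The margin of 1.0 is an arbitrary choice for this sketch.
    """
    d = euclidean_distance(u, v)
    if same_group:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings that are 0.5 apart.
emb_a = [0.0, 0.0]
emb_b = [0.3, 0.4]

loss_same = contrastive_loss(emb_a, emb_b, same_group=True)
loss_diff = contrastive_loss(emb_a, emb_b, same_group=False)

# A dissimilar pair that is already farther apart than the margin costs nothing.
loss_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_group=False)
```

In a Siamese setup, both inputs of a pair would pass through the same encoder before a loss of this kind is computed on the resulting embeddings.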


Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Model architectures.**
Architectures of the components proposed in this work: a the Siamese neural network and b SiamCDR. In both panels, boxes with bold borders and a grey face denote trained components. The input pair in a is either a pair of drugs or a pair of cell lines, depending on whether the drug or the cell line encoder is being trained. Dashed lines and box borders in b indicate components that are optional depending on the model variation. See the respective sections in Methods for complete details.

**Fig. 2. Comparing effective score to model predictions.**
Relationship between CES (continuous effective score) and the scores predicted by a DeepDSC, b SiamCDR_RF, c SiamCDR_LR, and d SiamCDR_DNN. Each point represents a drug-cell line pair. Drug-cell line pairs containing drugs recommended in the top-3 for at least 3 cell lines by all 4 models are highlighted with distinct colors and shapes (see legend). Note that the scale of the predicted score in a differs from that in b and c; this was done so that the general trend can be visualized.

**Fig. 3. Scaled drug- and cell line-derived feature importance for model predictions.**
Min-max scaled feature importance (>0.01) for the top-100 non-zero features is plotted in descending order along the x-axis for a DeepDSC, b SiamCDR_RF, c SiamCDR_LR, and d SiamCDR_DNN. In each plot, the average relative feature importance for drug- and cell line-derived features is shown with horizontal lines.

**Fig. 4. t-SNE plots for cell line and drug feature representations.**
t-SNE plots were generated from the embeddings produced by a DeepDSC's autoencoder and b SiamCDR's cell line encoder for cancers with at least 15 cells. Each cancer is represented by a distinct color/shape combination. In b, clusters of single cancer types are highlighted with dotted black lines; clusters discussed in the Results are highlighted and labeled with distinct colored lines. Clusters i, ii, and iii in b are magnified in c, d, and e, respectively. We also produce t-SNE plots for drugs whose MOAs have at least 10 drugs in the pre-training data, using either f 256-bit Morgan fingerprints (DeepDSC) or g SiamCDR_RF's drug encoder embeddings. Each MOA is represented by a unique color/shape combination. Clusters discussed in the Results section are highlighted in g and labeled with distinct colored lines. Clusters iv, v, and vi in g are magnified in h, i, and j, respectively. The axis scales for all plots have been adjusted to best fit the data. PS protein synthesis, TOP topoisomerase, H2RA histamine receptor antagonist, TP tubulin polymerization, AR adrenergic receptor, COX cyclooxygenase, GR glucocorticoid receptor, AK aurora kinase, -INH inhibitor, -A antagonist, -Ag agonist.
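The feature-importance processing described for Fig. 3 can be sketched roughly as follows. This is a hedged illustration, not the authors' code: the order of operations (scaling first, then thresholding and truncating), the `drug_`/`cell_` name prefixes, and the toy importance values are assumptions made for this example.

```python
def minmax_scale(values):
    """Rescale values linearly to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def top_scaled_features(importances, k=100, threshold=0.01):
    """Return up to k (name, scaled importance) pairs above threshold, descending."""
    names = list(importances.keys())
    scaled = minmax_scale([importances[n] for n in names])
    ranked = sorted(zip(names, scaled), key=lambda pair: pair[1], reverse=True)
    return [(name, s) for name, s in ranked if s > threshold][:k]


def mean_importance_by_source(features, prefix):
    """Average scaled importance of retained features whose name starts with prefix."""
    selected = [s for name, s in features if name.startswith(prefix)]
    return sum(selected) / len(selected) if selected else 0.0


# Hypothetical raw importances; prefixes mark drug- vs cell line-derived features.
raw = {"drug_0": 4.0, "drug_1": 2.0, "cell_0": 3.0, "cell_1": 0.0}

top = top_scaled_features(raw)
drug_mean = mean_importance_by_source(top, "drug_")
cell_mean = mean_importance_by_source(top, "cell_")
```

Comparing the two group means gives one simple way to check whether a model leans more heavily on drug-derived or cell line-derived features.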


References

1. Choi J, Park S, Ahn J. RefDNN: a reference drug based neural network for more accurate prediction of anticancer drug resistance. Sci. Rep. 2020;10:1861. doi: 10.1038/s41598-020-58821-x.
1. Zou H, Hastie T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B: Stat. Methodol. 2005;67:301–320. doi: 10.1111/j.1467-9868.2005.00503.x.
1. Liu X, Zhang W. A subcomponent-guided deep learning method for interpretable cancer drug response prediction. PLOS Comput. Biol. 2023;19:e1011382. doi: 10.1371/journal.pcbi.1011382.
1. Li M, et al. DeepDSC: A deep learning method to predict drug sensitivity of cancer cell lines. IEEE/ACM Trans. Comput. Biol. Bioinform. 2021;18:575–582. doi: 10.1109/TCBB.2019.2919581.
1. Morgan HL. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 1965;5:107–113. doi: 10.1021/c160017a018.
