Nat Commun. 2022 Mar 29;13(1):1667.

doi: 10.1038/s41467-022-29292-7.

Knowledge graph-based recommendation framework identifies drivers of resistance in EGFR mutant non-small cell lung cancer

Affiliations

¹ Biological Insight Knowledge Graph (BIKG), AI Engineering, R&D IT, AstraZeneca, Cambridge, UK.
² Early Computational Oncology, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK.
³ Bioscience, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK.
⁴ NLP Lab, Enterprise AI Services, IGNITE, AstraZeneca, Cambridge, UK.
⁵ Biological Insight Knowledge Graph (BIKG), AI Engineering, R&D IT, AstraZeneca, Gothenburg, Sweden.
⁶ Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK.
⁷ Early Computational Oncology, Research and Early Development, Oncology R&D, AstraZeneca, Waltham, MA, USA.
⁸ Biological Insight Knowledge Graph (BIKG), AI Engineering, R&D IT, AstraZeneca, Cambridge, UK. elipapa@alum.mit.edu.
⁹ Early Computational Oncology, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK. krishna.bulusu@astrazeneca.com.

PMID: 35351890
PMCID: PMC8964738
DOI: 10.1038/s41467-022-29292-7

Anna Gogleva et al. Nat Commun. 2022.

Abstract

Resistance to EGFR inhibitors (EGFRi) presents a major obstacle in treating non-small cell lung cancer (NSCLC). One of the most exciting new ways to find potential resistance markers involves running functional genetic screens, such as CRISPR, followed by manual triage of significantly enriched genes. This triage process to identify 'high value' hits resulting from the CRISPR screen involves manual curation that requires specialized knowledge and can take even experts several months to complete comprehensively. To find key drivers of resistance faster, we build a recommendation system on top of a heterogeneous biomedical knowledge graph integrating pre-clinical, clinical, and literature evidence. The recommender system ranks genes based on trade-offs between diverse types of evidence linking them to potential mechanisms of EGFRi resistance. This unbiased approach identifies 57 resistance markers from >3,000 genes, reducing hit identification time from months to minutes. In addition to reproducing known resistance markers, our method identifies previously unexplored resistance mechanisms that we prospectively validate.

Conflict of interest statement

The authors declare the following competing interests. All authors except M.P. and H.T. were full-time employees and shareholders of AstraZeneca at the time of study. M.P. and H.T. were PostDoc Fellows of the AstraZeneca PostDoc program at the time when experiments were completed. J.D.’s current affiliation is Tempus Labs, Cambridge MA. E.P.’s current affiliation is DeepMind, London, UK.

Figures

**Fig. 1. Recommendation system takes into account diverse types of evidence to suggest promising drivers of drug resistance in NSCLC.**
The evidence from a specially built knowledge graph, the literature, and clinical and pre-clinical datasets is aggregated and formalized as objectives in a multi-objective optimization (MOO) task. Recommended solutions (genes) represent the optimal trade-offs between the conflicting objectives. A subset of recommended genes is passed on for experimental validation.
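The idea of "optimal trade-offs between conflicting objectives" can be illustrated with a Pareto front: a gene is kept if no other gene is at least as good on every objective and strictly better on at least one. The following is a minimal sketch under stated assumptions, not the authors' implementation. The gene names, objective names (`crispr_consistency`, `lit_support`), and scores are hypothetical, and the caption does not say which MOO algorithm was used.

```python
from typing import Dict, List

# Hypothetical toy scores; gene names, objectives, and values are illustrative only.
GENES: Dict[str, Dict[str, float]] = {
    "GENE_A": {"crispr_consistency": 0.9, "lit_support": 0.2},
    "GENE_B": {"crispr_consistency": 0.6, "lit_support": 0.7},
    "GENE_C": {"crispr_consistency": 0.5, "lit_support": 0.5},
    "GENE_D": {"crispr_consistency": 0.2, "lit_support": 0.9},
}


def dominates(a: Dict[str, float], b: Dict[str, float],
              directions: Dict[str, int]) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one.

    `directions` maps objective name to +1 (maximize) or -1 (minimize).
    """
    at_least_as_good = all(d * a[k] >= d * b[k] for k, d in directions.items())
    strictly_better = any(d * a[k] > d * b[k] for k, d in directions.items())
    return at_least_as_good and strictly_better


def pareto_front(scores: Dict[str, Dict[str, float]],
                 directions: Dict[str, int]) -> List[str]:
    """Return the sorted names of all non-dominated genes."""
    return sorted(
        gene
        for gene, s in scores.items()
        if not any(
            dominates(other, s, directions)
            for name, other in scores.items()
            if name != gene
        )
    )


if __name__ == "__main__":
    # Maximize both objectives: GENE_C is dominated by GENE_B and drops out.
    print(pareto_front(GENES, {"crispr_consistency": 1, "lit_support": 1}))
    # Flip one objective to "minimize": the set of trade-offs changes.
    print(pareto_front(GENES, {"crispr_consistency": 1, "lit_support": -1}))
```

Changing the set of objectives or the sign of a direction changes which genes are non-dominated, which is why the choice of objectives matters for the final recommendation.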

**Fig. 2. SkywalkR interactive interface allows users to re-rank CRISPR hits based on various combinations of objectives.**
A On the sidebar panel, each objective is represented by a slider. Users can decide which objectives to include in the optimization and can also specify the direction of optimization (minimize or maximize). B Additional tools to explore the results. The relative view shows profiles of recommended genes. Bar plots show standardized values across objectives and top recommended genes. The co-occurrence heatmap shows clusters of genes frequently mentioned together in the context of EGFR TKI resistance.

**Fig. 3. Expert curation of recommended hits along with features used for hit triaging.**
Evaluation of the top recommended genes by five independent experts indicates that the majority of genes can be classified either as A “known resistance markers” or C “previously unknown/less known, credible hits”. Eight genes were labeled as “previously unknown/less known hits with unclear tractability” by the experts (B). Here, unclear tractability refers to the absence of a clear path to validate predictions experimentally. “Previously unknown/less known” is defined within the biological context of this study. Features used to generate predictions were as follows. full_screen: overall consistency across all conditions in the CRISPR screen; RNASeq_LFC: log2 fold-change from an internal RNASeq study; clinical_ES1, clinical_ES2, clinical_ES3: enrichment scores from clinical studies where resistant patients were compared against responders; lit_EGFR, lit_NSCLC: co-occurrence estimates from the literature; pagerank: “popularity” measure of a node; betweenness: centrality estimate of a node. The “Agreement” column indicates the number of experts assigning a certain label to a gene. Bubble size reflects the value normalized across the full set of features for all genes.
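To make the pagerank feature concrete, the sketch below computes PageRank by power iteration on a tiny undirected toy graph. It is only an illustration of why a well-connected node scores as more "popular". The graph, the node names, and the damping factor of 0.85 are assumptions, and the caption does not describe how the graph features were actually computed on the knowledge graph.

```python
from typing import Dict, List

# Hypothetical toy graph (undirected, stored as symmetric adjacency lists).
TOY_GRAPH: Dict[str, List[str]] = {
    "HUB": ["GENE_A", "GENE_B", "GENE_C"],
    "GENE_A": ["HUB", "GENE_B"],
    "GENE_B": ["HUB", "GENE_A"],
    "GENE_C": ["HUB"],
}


def pagerank(adjacency: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 100) -> Dict[str, float]:
    """PageRank by power iteration; returned scores sum to 1."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the same "teleport" share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            neighbours = adjacency[v]
            if neighbours:
                # Split this node's rank evenly among its neighbours.
                share = damping * rank[v] / len(neighbours)
                for u in neighbours:
                    new_rank[u] += share
            else:
                # A node with no edges spreads its rank over all nodes.
                for u in nodes:
                    new_rank[u] += damping * rank[v] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    for node, score in sorted(pagerank(TOY_GRAPH).items(), key=lambda kv: -kv[1]):
        print(f"{node}: {score:.3f}")
```

In this toy graph the node with the most connections receives the highest score and the node with a single edge receives the lowest.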

**Fig. 4. Shapley values reflect relative importance of features that differentiate relevant hits from non-relevant.**
We modeled the task as a binary classification problem, where a gene can be labeled as either relevant or not. A list of genes prioritized by domain experts for secondary screens was used as the set of labels. The analysis was run using A a set of default objectives and B the full set of objectives. In both cases, features associated with consistency in CRISPR screens appear to have the greatest impact on classification, closely followed by graph-derived features.
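As a rough illustration of what a Shapley value measures, the sketch below computes exact Shapley values for a hand-written toy value function over three features. This is an assumption-laden example, not the analysis behind the figure: the feature names are borrowed from the captions, the payoff numbers are invented, and real analyses of trained classifiers typically rely on approximation methods rather than full enumeration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List

FEATURES: List[str] = ["full_screen", "pagerank", "lit_EGFR"]


def toy_value(subset: FrozenSet[str]) -> float:
    """Hypothetical payoff of a feature subset (invented numbers).

    full_screen contributes 0.5, pagerank 0.25, and the two together
    add a 0.25 interaction bonus; lit_EGFR contributes nothing.
    """
    value = 0.0
    if "full_screen" in subset:
        value += 0.5
    if "pagerank" in subset:
        value += 0.25
    if "full_screen" in subset and "pagerank" in subset:
        value += 0.25
    return value


def shapley_values(features: List[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values by enumerating all subsets of the other features."""
    n = len(features)
    phi: Dict[str, float] = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for r in range(n):
            # Weight of a coalition of size r that does not contain f.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for combo in combinations(others, r):
                s = frozenset(combo)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


if __name__ == "__main__":
    for name, phi in shapley_values(FEATURES, toy_value).items():
        print(f"{name}: {phi:.3f}")
```

The interaction bonus is split equally between the two features that create it, and the values add up to the payoff of the full feature set, which is the property that makes Shapley values usable as a measure of relative feature importance.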

**Fig. 5. Validation of resistance genes proposed by recommendation system.**
A Proliferation assays confirm osimertinib sensitivity of EGFR-mutant NSCLC cell lines PC-9, II-18 and HCC827. Data are presented as mean values +/– SD, n = 2 technical replicates, representative data of three biological replicates. B Resistance to osimertinib by target gene KO was measured in flow cytometry-based competition assays. An increase in the percentage of KO cells compared to control cells was considered a resistance effect to osimertinib. C, D Confirmation of resistance to osimertinib in *EGFR* mutant NSCLC KO cell line models. The effect of KO of genes (EZH2, KCTD5, NF1 and PTEN) on osimertinib resistance was tested in II-18, PC-9 and HCC827 cell lines. Proliferation of cells in DMSO or osimertinib (50 nM) was tracked over 14 days and the percentage of BFP+ KO cells was plotted. E Activation of MET drives resistance to osimertinib in *EGFR* mutant PC-9. Cells were engineered to express the dCas9-VP64 SAM CRISPR for MET activation. For C–E, the two independent guide RNAs per gene were considered biological replicates (n = 2); three technical replicates are visualized. Data are presented as mean values +/– SD. F Activation of WWTR1 drives resistance to osimertinib in *EGFR* mutant PC-9 cells as measured in 21-day clonogenic assays. PC-9 cells were engineered to express the dCas9-VP64 SAM CRISPR for WWTR1 activation. Data are presented as mean values +/– SD, representative data of two biological replicates, each consisting of technical duplicates.

**Fig. 6. Validation of SRC and EZH2 as resistance drivers proposed by recommendation system.**
A Osimertinib dose–response curve in PC9 parental cells compared to two lines derived to have acquired osimertinib resistance, as measured by Cell Titer Glo (96 h treatment). B ECF-506 (SRCi) dose–response curve in PC9 parental vs. resistant lines, as measured by Cell Titer Glo (96 h treatment). Resistant cells were co-treated with 160 nM osimertinib. C Viability assay demonstrating the dose-dependent effect of EZH2 inhibition on osimertinib resistance in II-18. OR = osimertinib (Osi) resistant. Data are presented as mean values +/– SD (n = 3) of a typical plot, where the experiment was repeated at least three times.
