Bioinformatics. 2024 Jan 2;40(1):btad764.
doi: 10.1093/bioinformatics/btad764.

ELISL: early-late integrated synthetic lethality prediction in cancer

Yasin I Tepeli et al. Bioinformatics. 2024.

Abstract

Motivation: Anti-cancer therapies based on synthetic lethality (SL) exploit tumour vulnerabilities for treatment with reduced side effects, by targeting a gene that is jointly essential with another whose function is lost. Computational prediction is key to expedite SL screening, yet existing methods are vulnerable to prevalent selection bias in SL data and reliant on cancer or tissue type-specific omics, which can be scarce. Notably, sequence similarity remains underexplored as a proxy for related gene function and joint essentiality.

Results: We propose ELISL, Early-Late Integrated SL prediction with forest ensembles, using context-free protein sequence embeddings and context-specific omics from cell lines and tissue. Across eight cancer types, ELISL showed superior robustness to selection bias and recovery of known SL genes, as well as promising cross-cancer predictions. Co-occurring mutations in a BRCA gene and ELISL-predicted pairs from the HH, FGF, WNT, or NEIL gene families were associated with longer patient survival times, revealing therapeutic potential.

Availability and implementation: Data: 10.6084/m9.figshare.23607558 & Code: github.com/joanagoncalveslab/ELISL.

Conflict of interest statement

None declared.

Figures

Figure 1.
ELISL framework, SL label imbalance, and within-cancer prediction performance. (a) The ELISL framework. (b) Number and ratio of positive and negative samples in the train set for each cancer type. (c) Prediction performance (AUPRC) of SL prediction methods within a cancer type over 10 runs. P: significance of the difference in performance between the best of other models and the best ELISL model over 10 runs (lines between boxes).
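The AUPRC metric in panel (c) can be illustrated with a minimal sketch. It computes average precision, one common step-wise estimate of the area under the precision-recall curve. The source does not state which estimator the authors used, so the function name `average_precision` and the tie handling (ties keep their input order) are assumptions made for illustration. Under label imbalance such as that shown in panel (b), a random ranking is expected to score close to the positive ratio, so AUPRC values are best read against that baseline.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: iterable of 0/1 values (1 = SL pair, 0 = non-SL pair).
    scores: predicted scores, higher meaning more likely SL.
    Illustrative only; not taken from the ELISL code base.
    """
    labels = list(labels)
    scores = list(scores)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank pairs from highest to lowest predicted score.
    # sorted() is stable, so tied scores keep their input order (assumption).
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


def positive_ratio(labels):
    """Fraction of positives: the expected AUPRC of a random ranking."""
    labels = list(labels)
    return sum(labels) / len(labels)
```

For example, `average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])` averages the precision at the two positives (1/1 and 2/3).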
Figure 2.
Impact of gene selection bias on SL prediction performance. (a) Left panels: performance under similar train/test bias (same as in Fig. 1c); right panels: double gene holdout inducing differences in gene selection bias between train and test set. Performance (AUPRC) per cancer type and for 10 runs where each pair of train and test sets does not share any genes. P: significance of the difference between the double holdout performances of the two models that performed best under similar bias. (b) Cross-SL label source. Performance (AUPRC) reported for models trained using labels from one SL source and evaluated on another SL source (10 runs). P: significance of the difference between the best ELISL model and the best of the other models.
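The double gene holdout in panel (a) requires that no gene appears in both the train and the test set. One way to build such a split is sketched below, as an assumption rather than the authors' procedure: genes are partitioned first, and only pairs whose two genes fall on the same side are kept. The function name `double_gene_holdout`, the `test_fraction` parameter, and the placeholder gene identifiers are hypothetical.

```python
import random


def double_gene_holdout(pairs, test_fraction=0.3, seed=0):
    """Split gene pairs so that train and test pairs share no genes.

    pairs: list of (gene_a, gene_b) tuples.
    Returns (train_pairs, test_pairs). Pairs that mix a train gene with
    a test gene are discarded, which shrinks the usable data.
    """
    genes = sorted({gene for pair in pairs for gene in pair})
    rng = random.Random(seed)
    rng.shuffle(genes)

    # Reserve a fraction of the genes exclusively for testing.
    n_test = max(1, int(len(genes) * test_fraction))
    test_genes = set(genes[:n_test])

    train, test = [], []
    for a, b in pairs:
        if a in test_genes and b in test_genes:
            test.append((a, b))
        elif a not in test_genes and b not in test_genes:
            train.append((a, b))
        # Otherwise the pair straddles the partition and is dropped.
    return train, test


if __name__ == "__main__":
    # Placeholder identifiers, not real SL labels.
    example = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
               ("G4", "G5"), ("G5", "G6"), ("G3", "G6")]
    train_pairs, test_pairs = double_gene_holdout(example, test_fraction=0.5)
    print(len(train_pairs), len(test_pairs))
```

Because straddling pairs are dropped, the split trades sample size for a test set that contains only unseen genes.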
Figure 3.
ELISL-RF SL prediction within/across cancer types and feature contribution. (a) Performance of cancer-specific models and pan-cancer models, measured as average AUPRC over 10 runs. Pan-cancer model performances are reported in a separate row at the bottom, where models are trained on all other cancer types except the one the model is supposed to predict on. (b) Contribution of each data source to the predictions of the ELISL-RF model within the same cancer type.
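The pan-cancer setting in panel (a), where a model is trained on every cancer type except the one it predicts on, amounts to a leave-one-cancer-out split. A minimal sketch follows; the function name `leave_one_cancer_out` and the dictionary layout are assumptions for illustration and are not taken from the ELISL code.

```python
def leave_one_cancer_out(samples_by_cancer):
    """Yield (held_out, train_samples, test_samples) per cancer type.

    samples_by_cancer: dict mapping a cancer type (e.g. "BRCA") to a
    list of samples. The training set pools all other cancer types.
    """
    for held_out in samples_by_cancer:
        train = [
            sample
            for cancer, samples in samples_by_cancer.items()
            if cancer != held_out
            for sample in samples
        ]
        test = list(samples_by_cancer[held_out])
        yield held_out, train, test


if __name__ == "__main__":
    # Integers stand in for labelled gene pairs.
    toy = {"BRCA": [1, 2], "LUAD": [3], "OV": [4, 5]}
    for cancer, train, test in leave_one_cancer_out(toy):
        print(cancer, train, test)
```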
Figure 4.
Analysis of top SL gene pairs predicted by ELISL-RF. (a) Top 3 pairs ranked by SL prediction score for BRCA, LUAD, and OV (average across 10 test sets). (b, c) Prediction of unknown gene pairs (not in test sets) using ELISL-RF trained on BRCA data without the survival feature. (b) Distribution of SL scores for unknown pairs compared to known SL and non-SL pairs. Dashed lines denote the 5th and 95th percentiles. (c) Prediction scores of ELISL-RF without survival for the top 10 pairs in the BRCA test set and the unknown set. (d) Prediction scores of ELISL-RF without survival for pairs involving BRCA1/2 and HH, FGF, or WNT family members. Bar length denotes the average SL score and black line length the standard deviation for the set of pairs of interest. (e–h) Differences in survival between patient tumours with and without simultaneous alterations in both families of a gene pair, shown as Kaplan–Meier curves with Wald test P-values of survival differences based on CoxPH models of co-mutation status adjusted for age, sex, and cancer type, for pairs involving BRCA genes and members of the (e) HH, (f) FGF, (g) WNT, and (h) NEIL families.
