Data mining for mutation-specific targets in acute myeloid leukemia

Brooks Benard¹, Andrew J Gentles², Thomas Köhnke¹, Ravindra Majeti³, Daniel Thomas⁴

Affiliations

¹ Department of Medicine, Division of Hematology, Cancer Institute, and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA.
² Departments of Medicine (Biomedical Informatics), and Biomedical Data Sciences, Stanford University School of Medicine, Stanford, CA, USA.
³ Department of Medicine, Division of Hematology, Cancer Institute, and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA. rmajeti@stanford.edu.
⁴ Department of Medicine, Division of Hematology, Cancer Institute, and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA. dthomas3@stanford.edu.

PMID: 30728456
PMCID: PMC7250137
DOI: 10.1038/s41375-019-0387-y

Review

Leukemia. 2019 Apr;33(4):826-843. doi: 10.1038/s41375-019-0387-y. Epub 2019 Feb 6.

Abstract

Three mutation-specific targeted therapies have recently been approved by the FDA for the treatment of acute myeloid leukemia (AML): midostaurin for FLT3 mutations, enasidenib for relapsed or refractory cases with IDH2 mutations, and ivosidenib for cases with an IDH1 mutation. Together, these agents offer a mutation-directed treatment approach for up to 45% of de novo adult AML cases, a welcome deluge after a prolonged drought. At the same time, a number of computational tools have recently been developed that promise to further accelerate progress in mutation-specific therapy for AML and other cancers. Technical advances, together with comprehensively annotated AML tissue banks, have made large and complex data sets available for exploration by the end-user, including (i) microarray gene expression, (ii) exome sequencing, (iii) deep sequencing data of sub-clone heterogeneity, (iv) RNA sequencing of gene expression (bulk and single cell), (v) DNA methylation and chromatin, and (vi) germline quantitative trait loci. Yet few clinicians or experimental hematologists have the time or the training to access or analyze these repositories. This review summarizes the data sets and bioinformatic tools currently available to further the discovery of mutation-specific targets, with an emphasis on web-based applications that are open, accessible, user-friendly, and do not require coding experience to navigate. We show examples of how available data can be mined to identify potential targets using synthetic lethality, drug repurposing, epigenetic sub-grouping, and proteomic networks, while also highlighting strengths and limitations and the need for superior models for validation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig 1. Mathematical approaches for discovering novel gene-gene and gene-drug therapeutic vulnerabilities in cancer.**
(a) Pairwise correlation relationships and regression analysis can be used to determine if there is a significant linear relationship between two variables (e.g. expression of gene A vs. gene B may indicate both are in the same cell-type or cellular process). (b) Nonlinear regression analysis can be applied to biological data from large drug/small molecule screens, such as dose-response curves summarized by the inhibitory concentration at 50% of maximal effect (IC50); note some effective hits may not show changes in IC50, but may show important differences in other end-points. (c) Unusual non-normal distributions of certain variables, such as bi-modality, can be used to predict synthetic lethal gene pairs. (d) Boolean implications resembling IF-THEN rules, which represent unique mutual exclusivity or subset relationships, can be inferred from large data sets. Tools like STEPMiner can be used to binarize complex variables. (e) Unsupervised hierarchical clustering, shown as either a dendrogram and heat-map (upper panel) or an icicle plot (as in the MiSTIC interface, lower panel), can show closely related samples not otherwise apparent from clinical or pathological parameters. Variables can be calculated to be closely related based on different distance metrics (e.g. Euclidean, Manhattan), and clinically actionable sub-groups can be recognized by common patterns. (f) Networks of experimentally validated or bioinformatically predicted “nodes” can be used in machine learning and data mining approaches, for example by applying decision tree learning as a predictive modeling approach to infer novel interaction networks, critical nodes in a pathway, and drug repurposing opportunities.
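
The Boolean implication idea in panel (d) can be illustrated with a minimal Python sketch. It is not the STEPMiner algorithm or any published implementation: the one-step least-squares threshold is a simplified stand-in for binarization, the violation-rate cutoff is an arbitrary assumption, and the expression values are made-up toy numbers. All function names are hypothetical.

```python
from itertools import product


def _sse(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def step_threshold(values):
    """Return the cut that splits the sorted values into two groups with
    minimal total squared error (a single-step fit; simplified stand-in)."""
    xs = sorted(values)
    best_cut, best_sse = None, float("inf")
    for i in range(1, len(xs)):
        sse = _sse(xs[:i]) + _sse(xs[i:])
        if sse < best_sse:
            best_sse, best_cut = sse, (xs[i - 1] + xs[i]) / 2
    return best_cut


def binarize(values):
    """Label each sample as high (True) or low (False) relative to the cut."""
    cut = step_threshold(values)
    return [v > cut for v in values]


def quadrant_counts(a_high, b_high):
    """Count samples in each of the four (A state, B state) quadrants."""
    counts = {q: 0 for q in product((False, True), repeat=2)}
    for a, b in zip(a_high, b_high):
        counts[(a, b)] += 1
    return counts


def implies_high_to_low(a_high, b_high, max_violation_rate=0.05):
    """'IF A high THEN B low' holds when, among A-high samples, the share
    that are also B-high stays at or below the assumed violation cutoff."""
    counts = quadrant_counts(a_high, b_high)
    n_a_high = counts[(True, False)] + counts[(True, True)]
    if n_a_high == 0:
        return False  # no A-high samples, so the rule cannot be assessed
    return counts[(True, True)] / n_a_high <= max_violation_rate


# Toy expression values for two hypothetical genes across eight samples.
gene_a = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9, 0.3, 0.25]
gene_b = [4.8, 0.2, 5.1, 0.1, 0.3, 0.2, 5.0, 0.1]

gene_a_high = binarize(gene_a)
gene_b_high = binarize(gene_b)
a_high_implies_b_low = implies_high_to_low(gene_a_high, gene_b_high)
```

In this toy data no sample is high for both genes, so the rule holds in both directions, which is the mutual exclusivity pattern the caption describes. A real analysis would need a proper statistical test for quadrant sparsity rather than a fixed cutoff.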

**Fig 2. Algorithm schematics for recent computational tools to predict mutation-specific synthetic lethal interactions.**
(a) DAISY uses three different data inputs in parallel to predict synthetic lethal (SL) interactions: (i) overlap of somatic copy number alterations and somatic mutation, (ii) essentiality results from large-scale cell-line shRNA screens, and (iii) pairwise gene co-expression based on Pearson’s correlation score. The overlap between these three orthogonal methods constructs a network of gene pairs for a given cancer that can then be tested in experimental systems. (b) MiSL uses data exclusively from primary patient samples across multiple cancer types to find a set of synthetic lethal pairs for a given mutation and a given cancer. The mutation must be recurrent in some pan-cancer samples. Data inputs are (i) copy number, (ii) somatic mutation (all mutation types), and (iii) RNA-seq expression data, which are used to infer mutation-specific Boolean implications for a given cancer type and a given mutation. Expression data is used as a final filter to ensure that the predicted gene pair has concordant gene expression changes with copy number alterations across cancer types and is over-expressed in the cancer type of interest in the presence of the mutation. (c) MutExSL utilizes copy number and expression to predict SL interactions in two different pathway-based approaches. In the parallel pathway model, deletions or reduced expression in gene A create a susceptibility for targeting the synthetic lethal target, gene B, and thus inhibit both pro-survival networks. Additionally, using a negative feedback-loop model, MutExSL identifies interactions where loss of function of gene A is predicted to increase pro-survival signaling through gene B, thus identifying gene B as a targetable node. (d) Combining computational synthetic lethal tools with drug-gene databases. Many recurrent mutations in cancer occur in undruggable genes and present a major clinical challenge. Mining results from SL tools and drug-gene databases can be used to identify SL relationships involving an undruggable gene where the SL partner is druggable, resulting in a possible mutation-specific sensitivity in a traditionally undruggable situation. *mutated
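
The overlap step in panel (a) and the drug-gene lookup in panel (d) can be sketched in a few lines of Python. This is only an illustration of the set logic, assuming each evidence source has already produced its own list of candidate gene pairs; it does not reproduce the statistical inference inside DAISY, MiSL, or MutExSL. The gene names, drug names, and function names are all hypothetical placeholders.

```python
def consensus_pairs(*evidence_sets):
    """Keep only the gene pairs supported by every evidence source.
    Pairs are treated as unordered, so (A, B) and (B, A) match."""
    if not evidence_sets:
        return set()
    normalized = [{frozenset(pair) for pair in pairs} for pairs in evidence_sets]
    return set.intersection(*normalized)


def druggable_partners(pairs, mutated_gene, drug_gene_db):
    """For a mutated (possibly undruggable) gene, return its predicted SL
    partners that appear in a drug-gene lookup, with their listed drugs."""
    hits = {}
    for pair in pairs:
        if mutated_gene in pair and len(pair) == 2:
            (partner,) = pair - {mutated_gene}
            if partner in drug_gene_db:
                hits[partner] = sorted(drug_gene_db[partner])
    return hits


# Placeholder candidate pairs from three orthogonal evidence sources.
scna_mutation_pairs = {("GENE_M", "GENE_B"), ("GENE_M", "GENE_C"), ("GENE_X", "GENE_Y")}
shrna_essentiality_pairs = {("GENE_B", "GENE_M"), ("GENE_M", "GENE_C"), ("GENE_M", "GENE_D")}
coexpression_pairs = {("GENE_M", "GENE_B"), ("GENE_C", "GENE_M")}

consensus = consensus_pairs(
    scna_mutation_pairs, shrna_essentiality_pairs, coexpression_pairs
)

# Placeholder drug-gene lookup: gene -> set of drugs reported to target it.
drug_gene_db = {"GENE_B": {"drug_1", "drug_2"}, "GENE_D": {"drug_3"}}

candidates = druggable_partners(consensus, "GENE_M", drug_gene_db)
```

Here `GENE_D` is druggable but drops out because only one evidence source supports it, while `GENE_C` has full support but no listed drug. Requiring agreement across all sources is a deliberately strict choice; a looser vote (for example two of three) would trade specificity for more candidates.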
