Review
Leukemia. 2019 Apr;33(4):826-843.
doi: 10.1038/s41375-019-0387-y. Epub 2019 Feb 6.

Data mining for mutation-specific targets in acute myeloid leukemia

Brooks Benard et al. Leukemia. 2019 Apr.

Abstract

Three mutation-specific targeted therapies have recently been approved by the FDA for the treatment of acute myeloid leukemia (AML): midostaurin for FLT3 mutations, enasidenib for relapsed or refractory cases with IDH2 mutations, and ivosidenib for cases with an IDH1 mutation. Together, these agents offer a mutation-directed treatment approach for up to 45% of de novo adult AML cases, a welcome deluge after a prolonged drought. At the same time, a number of computational tools have recently been developed that promise to further accelerate progress in mutation-specific therapy for AML and other cancers. Technical advances together with comprehensively annotated AML tissue banks have resulted in the availability of large and complex data sets for exploration by the end-user, including (i) microarray gene expression, (ii) exome sequencing, (iii) deep sequencing data of sub-clone heterogeneity, (iv) RNA sequencing of gene expression (bulk and single cell), (v) DNA methylation and chromatin, and (vi) germline quantitative trait loci. Yet few clinicians or experimental hematologists have the time or the training to access or analyze these repositories. This review summarizes the data sets and bioinformatic tools currently available to further the discovery of mutation-specific targets, with an emphasis on web-based applications that are open, accessible, user-friendly, and do not require coding experience to navigate. We show examples of how available data can be mined to identify potential targets using synthetic lethality, drug repurposing, epigenetic sub-grouping, and proteomic networks, while also highlighting strengths and limitations and the need for superior models for validation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1. Mathematical approaches for discovering novel gene-gene and gene-drug therapeutic vulnerabilities in cancer.
(a) Pairwise correlation and regression analysis can be used to determine whether there is a significant linear relationship between two variables (e.g. correlated expression of gene A and gene B may indicate that both are in the same cell type or cellular process).
(b) Nonlinear regression analysis of biological data, such as dose-response curves used to estimate the inhibitory concentration at 50% of maximal effect (IC50), can be applied to large drug/small molecule screens; note that some effective hits may not show changes in IC50 but may show important differences in other end-points.
(c) Unusual non-normal distributions of certain variables, such as bimodality, can be used to predict synthetic lethal gene pairs.
(d) Boolean implications resembling IF-THEN rules, which represent mutual exclusivity or subset relationships, can be inferred from large data sets. Tools like STEPMiner can be used to binarize complex variables.
(e) Unsupervised hierarchical clustering, shown as either a dendrogram and heat-map (upper panel) or an icicle plot (as in the MiSTIC interface, lower panel), can reveal closely related samples that are not otherwise apparent from clinical or pathological parameters. How closely variables are related can be calculated with different distance metrics (e.g. Euclidean, Manhattan), and clinically actionable sub-groups can be recognized by common patterns.
(f) Networks of experimentally validated or bioinformatically predicted “nodes” can be used in machine learning and data mining approaches, for example by applying decision tree learning as a predictive model to infer novel interaction networks, critical nodes in a pathway, and drug repurposing opportunities.
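The Boolean implication idea in panel (d) can be illustrated with a minimal sketch. It assumes a simplified one-step fit for binarization (in the spirit of STEPMiner, but not the published implementation) and a crude sparsity test against the count expected under independence. The function names, the `max_ratio` cutoff, and the expression values are all hypothetical and chosen only for illustration; each gene needs at least two samples.

```python
from statistics import mean


def step_threshold(values):
    """Fit a single step to the sorted values and return the cut point.

    Simplified stand-in for a STEPMiner-style binarization: try every split
    of the sorted values and keep the one with the lowest squared error
    around the two segment means.
    """
    xs = sorted(values)
    best_sse, best_cut = float("inf"), None
    for i in range(1, len(xs)):
        lo, hi = xs[:i], xs[i:]
        m_lo, m_hi = mean(lo), mean(hi)
        sse = sum((x - m_lo) ** 2 for x in lo) + sum((x - m_hi) ** 2 for x in hi)
        if sse < best_sse:
            best_sse, best_cut = sse, (xs[i - 1] + xs[i]) / 2
    return best_cut


def quadrant_counts(a, b):
    """Binarize both genes and count samples in each (A high?, B high?) quadrant."""
    ta, tb = step_threshold(a), step_threshold(b)
    counts = {(False, False): 0, (False, True): 0, (True, False): 0, (True, True): 0}
    for x, y in zip(a, b):
        counts[(x > ta, y > tb)] += 1
    return counts


def implies_high_then_low(a, b, max_ratio=0.2):
    """Report 'IF A high THEN B low' when the (A high, B high) quadrant is
    much sparser than expected if A and B were independent.

    max_ratio is an arbitrary illustrative cutoff, not a published value.
    """
    c = quadrant_counts(a, b)
    n = sum(c.values())
    a_high = c[(True, False)] + c[(True, True)]
    b_high = c[(False, True)] + c[(True, True)]
    expected = a_high * b_high / n
    return expected > 0 and c[(True, True)] / expected <= max_ratio


# Toy expression values for ten samples (made up for illustration).
gene_a = [1.0, 1.2, 0.9, 1.1, 8.0, 8.3, 7.9, 8.1, 1.0, 1.1]
gene_b = [7.5, 1.0, 7.8, 1.2, 1.1, 0.9, 1.0, 1.2, 7.9, 8.0]

print(quadrant_counts(gene_a, gene_b))
print(implies_high_then_low(gene_a, gene_b))
```

In the toy data no sample has both genes high, so the implication is reported; a real analysis would need many more samples and a proper statistical test with an error-rate bound rather than a fixed ratio.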
Fig 2. Algorithm schematics for recent computational tools to predict mutation-specific synthetic lethal interactions.
(a) DAISY uses three different data inputs in parallel to predict synthetic lethal interactions: (i) overlap of somatic copy number alterations and somatic mutation, (ii) gene essentiality results from large-scale cell-line shRNA screens, and (iii) pairwise gene co-expression based on Pearson’s correlation score. The overlap between these three orthogonal methods constructs a network of gene pairs for a given cancer that can then be tested in experimental systems.
(b) MiSL uses data exclusively from primary patient samples across multiple cancer types to find a set of synthetic lethal pairs for a given mutation and a given cancer. The mutation must be recurrent in some pan-cancer samples. Data inputs are (i) copy number, (ii) somatic mutation (all mutation types), and (iii) RNA-seq expression data, which are used to infer mutation-specific Boolean implications for a given cancer type and a given mutation. Expression data is used as a final filter to ensure that the predicted gene pair has concordant gene expression changes with copy number alterations across cancer types and is over-expressed in the cancer type of interest in the presence of the mutation.
(c) MutExSL uses copy number and expression to predict SL interactions with two different pathway-based approaches. In the parallel pathway model, deletions or reduced expression in gene A create a susceptibility to targeting the synthetic lethal partner, gene B, so that both pro-survival networks are inhibited. In the negative feedback-loop model, MutExSL identifies interactions where loss of function of gene A is predicted to increase pro-survival signaling through gene B, thus identifying gene B as a targetable node.
(d) Combining computational synthetic lethal tools with drug-gene databases. Many recurrent mutations in cancer occur in undruggable genes and present a major clinical challenge. Mining results from SL tools and drug-gene databases can be used to identify SL relationships involving an undruggable gene where the SL partner is druggable, resulting in a possible mutation-specific sensitivity in a traditionally undruggable situation.
*mutated
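The overlap logic in panel (a) and the drug-gene lookup in panel (d) can be sketched as plain set operations. This is a simplified illustration under stated assumptions, not the DAISY implementation: each evidence source is assumed to have already been reduced to a list of candidate gene pairs, and the gene names, drug names, and function names below are hypothetical placeholders.

```python
def normalize(pairs):
    """Treat gene pairs as unordered so (A, B) and (B, A) compare equal."""
    return {frozenset(p) for p in pairs}


def consensus_pairs(*evidence_sets):
    """Keep only the candidate pairs supported by every evidence source."""
    return set.intersection(*(normalize(s) for s in evidence_sets))


def druggable_partners(sl_pairs, mutated_gene, drug_targets):
    """For a mutated (possibly undruggable) gene, list its predicted SL
    partners that appear as targets in a drug-gene lookup table."""
    hits = {}
    for pair in sl_pairs:
        if mutated_gene in pair and len(pair) == 2:
            (partner,) = pair - {mutated_gene}
            if partner in drug_targets:
                hits[partner] = sorted(drug_targets[partner])
    return hits


# Hypothetical candidate pairs from three independent evidence sources.
copy_number_evidence = [("GENE_M", "GENE_X"), ("GENE_M", "GENE_Y"), ("GENE_Q", "GENE_Z")]
shrna_evidence = [("GENE_X", "GENE_M"), ("GENE_M", "GENE_Y"), ("GENE_M", "GENE_W")]
coexpression_evidence = [("GENE_M", "GENE_X"), ("GENE_Y", "GENE_M")]

# Hypothetical drug-gene table: target gene -> set of drug names.
drug_targets = {"GENE_X": {"drug_1"}, "GENE_Z": {"drug_2"}}

consensus = consensus_pairs(copy_number_evidence, shrna_evidence, coexpression_evidence)
print(druggable_partners(consensus, "GENE_M", drug_targets))
```

Requiring agreement across all sources trades sensitivity for specificity; a looser variant could keep pairs supported by any two sources, at the cost of more candidates to validate experimentally.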

