Bioinformatics. 2021 Jun 9;37(9):1189-1197.
doi: 10.1093/bioinformatics/btaa957.

MEScan: a powerful statistical framework for genome-scale mutual exclusivity analysis of cancer mutations

Sisheng Liu et al. Bioinformatics.

Abstract

Motivation: Cancer somatic driver mutations associated with genes within a pathway often show a mutually exclusive pattern across a cohort of patients. This mutually exclusive mutational signal has been frequently used to distinguish driver from passenger mutations and to investigate relationships among driver mutations. Current methods for de novo discovery of mutually exclusive mutational patterns are limited because the heterogeneity in background mutation rate can confound mutational patterns, and the presence of highly mutated genes can lead to spurious patterns. In addition, most methods only focus on a limited number of pre-selected genes and are unable to perform genome-wide analysis due to computational inefficiency.

Results: We introduce a statistical framework, MEScan, for accurate and efficient mutual exclusivity analysis at the genomic scale. Our framework contains a fast and powerful statistical test for mutual exclusivity with adjustment of the background mutation rate and impact of highly mutated genes, and a multi-step procedure for genome-wide screening with the control of false discovery rate. We demonstrate that MEScan more accurately identifies mutually exclusive gene sets than existing methods and is at least two orders of magnitude faster than most methods. By applying MEScan to data from four different cancer types and pan-cancer, we have identified several biologically meaningful mutually exclusive gene sets.

Availability and implementation: MEScan is available as an R package at https://github.com/MarkeyBBSRF/MEScan.

Supplementary information: Supplementary data are available at Bioinformatics online.
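
The abstract mentions screening with control of the false discovery rate but does not say which procedure is used. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up rule, one common FDR procedure, to p-values for candidate gene sets. The function name and the example p-values are hypothetical and are not taken from the MEScan package.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag which hypotheses are rejected by the Benjamini-Hochberg step-up rule.

    This is a generic FDR procedure used here as a stand-in; the abstract
    does not state which FDR method MEScan applies.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five candidate gene sets.
example_p = [0.001, 0.045, 0.025, 0.20, 0.008]
flags = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold.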

Figures

Fig. 1.
Overview of the MEScan framework. A key component of MEScan is a fast and powerful statistical test, TG, for assessing mutual exclusivity of a candidate gene set. This test accounts for a patient- and gene-specific background mutation rate (for illustration, darker blue indicates higher and lighter blue indicates lower background mutation rate). By using a gene-specific weight, the test also balances the impact of each gene on the overall significance of the gene set. Based on this test, our genome-scale screening follows a multi-step procedure. Starting from the observed mutation data matrix, an MCMC algorithm is used to screen across candidate gene sets, where the probability of a gene set being sampled is proportional to the TG score of that set. Next, significant gene sets are identified with the control of the FDR. Finally, high-confidence gene sets are selected based on the criteria that all subsets of them are also significant and they do not have substantial overlaps
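
The screening step in this caption samples gene sets with probability proportional to their score. A minimal sketch of that idea, assuming a Metropolis sampler with a symmetric swap proposal, is shown below. The score function `exclusive_coverage` is a hypothetical stand-in and is not the TG statistic. The toy data, function names and parameters are likewise assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Callable, Dict, FrozenSet, List, Set


def exclusive_coverage(gene_set: FrozenSet[str], mutations: Dict[str, Set[int]]) -> float:
    """Stand-in score (NOT TG): 1 + number of patients mutated in exactly one gene of the set."""
    hits = Counter(p for g in gene_set for p in mutations[g])
    return 1.0 + sum(1 for count in hits.values() if count == 1)


def metropolis_gene_sets(
    genes: List[str],
    mutations: Dict[str, Set[int]],
    set_size: int,
    score: Callable[[FrozenSet[str], Dict[str, Set[int]]], float],
    n_steps: int,
    seed: int = 0,
) -> Counter:
    """Count how often each gene set is visited by a Metropolis chain.

    With a symmetric proposal and a strictly positive score, the chain's
    stationary distribution is proportional to the score.
    """
    rng = random.Random(seed)
    current = frozenset(rng.sample(genes, set_size))
    current_score = score(current, mutations)
    visits: Counter = Counter()
    for _ in range(n_steps):
        # Symmetric proposal: swap one member of the set for one non-member.
        gene_out = rng.choice(sorted(current))
        gene_in = rng.choice(sorted(set(genes) - current))
        proposal = (current - {gene_out}) | {gene_in}
        proposal_score = score(proposal, mutations)
        # Metropolis rule: always accept improvements, sometimes accept worse sets.
        if rng.random() < min(1.0, proposal_score / current_score):
            current, current_score = proposal, proposal_score
        visits[current] += 1
    return visits


# Toy data (hypothetical): gene -> set of patient indices carrying a mutation.
toy_mutations = {
    "A": {0, 1, 2, 3},
    "B": {4, 5, 6},
    "C": {7, 8, 9},
    "D": {0, 4, 7},
    "E": {1, 2},
}
toy_genes = sorted(toy_mutations)
visits = metropolis_gene_sets(toy_genes, toy_mutations, 3, exclusive_coverage, 20000)
top_set, top_count = visits.most_common(1)[0]
```

In this toy example, sets with higher stand-in scores should be visited more often over a long run, which is what lets a sampler of this kind concentrate on promising candidates without enumerating every gene set.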
Fig. 2.
Comparison of power for identifying a true mutually exclusive gene set based on simulations. Each simulated dataset contained 20 genes, including 3 genes with a true mutually exclusive mutational pattern and the other 17 genes without any pattern. Simulations were replicated 100 times and the power was calculated as the frequency that the top-ranked gene set was the 3-gene set with the true mutually exclusive mutational pattern. Four scenarios were considered. (A) The ratio of mutation frequencies was 1:1:1 for the 3 genes and the other 17 genes did not include a highly mutated gene; (B) the ratio of mutation frequencies was 3:2:1 for the 3 genes and the other 17 genes did not include a highly mutated gene; (C) the ratio of mutation frequencies was 1:1:1 for the 3 genes and the other 17 genes included a highly mutated gene; and (D) the ratio of mutation frequencies was 3:2:1 for the 3 genes and the other 17 genes included a highly mutated gene
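
The simulation design in this caption can be sketched as follows, assuming scenarios (A) and (B) only: 20 genes, a planted 3-gene mutually exclusive set, 100 replicates, and power defined as the fraction of replicates in which the planted set is the unique top-ranked triple. The number of patients, the coverage of the planted set, the background mutation rate and the ranking score are not given in the caption and are hypothetical choices here. The score is a simple stand-in, not the MEScan test.

```python
import itertools
import random
from typing import List, Sequence


def simulate_dataset(
    rng: random.Random,
    n_patients: int = 200,        # assumed; not stated in the caption
    n_genes: int = 20,
    ratio: Sequence[int] = (3, 2, 1),
    coverage: float = 0.6,        # assumed fraction of patients hit by the planted set
    background_rate: float = 0.05,  # assumed per-gene background mutation rate
) -> List[int]:
    """Return one bitmask per gene; bit p is set if patient p carries a mutation.

    Genes 0-2 form the planted mutually exclusive set; the rest are independent.
    """
    masks = [0] * n_genes
    covered = rng.sample(range(n_patients), round(coverage * n_patients))
    for patient in covered:
        # Each covered patient gets exactly one mutation, split by the ratio.
        gene = rng.choices(range(len(ratio)), weights=ratio)[0]
        masks[gene] |= 1 << patient
    for gene in range(len(ratio), n_genes):
        for patient in range(n_patients):
            if rng.random() < background_rate:
                masks[gene] |= 1 << patient
    return masks


def exactly_one_count(a: int, b: int, c: int) -> int:
    """Stand-in score: number of patients mutated in exactly one of three genes."""
    odd = a ^ b ^ c  # bits set in one gene or in all three
    return bin(odd & ~(a & b & c)).count("1")


def estimate_power(n_replicates: int = 100, seed: int = 0, **sim_kwargs) -> float:
    """Fraction of replicates where the planted triple strictly outranks all others."""
    rng = random.Random(seed)
    true_set = (0, 1, 2)
    wins = 0
    for _ in range(n_replicates):
        masks = simulate_dataset(rng, **sim_kwargs)
        true_score = exactly_one_count(*(masks[g] for g in true_set))
        others = (
            exactly_one_count(masks[i], masks[j], masks[k])
            for i, j, k in itertools.combinations(range(len(masks)), 3)
            if (i, j, k) != true_set
        )
        if all(s < true_score for s in others):
            wins += 1
    return wins / n_replicates


power_equal = estimate_power(ratio=(1, 1, 1))   # scenario (A)-like
power_skewed = estimate_power(ratio=(3, 2, 1))  # scenario (B)-like
```

Scenarios (C) and (D) would add a highly mutated gene among the 17 background genes. That case is not modeled in this sketch, and a naive score like the one above would be expected to be more easily misled by it.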
Fig. 3.
High-confidence mutually exclusive gene sets identified from real data analysis. MEScan was applied to identify high-confidence mutually exclusive gene sets based on (A) TCGA glioblastoma (n=290); (B) TCGA lung squamous cell carcinoma (n=174); (C) TCGA ovarian cancer (n=314); and (D) TCGA pan-cancer datasets (n=3205). One selected high-confidence mutually exclusive gene set from each dataset was presented in this figure. A full list of identified high-confidence gene sets is provided in Supplementary Table S3. Pathway diagrams in the figure were generated by using PathwayMapper (Bahceci et al., 2017)

References

    1. Akbani R. et al. (2015) Genomic classification of cutaneous melanoma. Cell, 161, 1681–1696.
    2. Avivar-Valderas A. et al. (2018) Functional significance of co-occurring mutations in PIK3CA and MAP3K1 in breast cancer. Oncotarget, 9, 21444–21458.
    3. Bahceci I. et al. (2017) PathwayMapper: a collaborative visual web editor for cancer pathways and genomic data. Bioinformatics, 33, 2238–2240.
    4. Best S.A. et al. (2018) Synergy between the KEAP1/NRF2 and PI3K pathways drives non-small-cell lung cancer with an altered immune microenvironment. Cell Metab., 27, 935–943.
    5. Brennan C.W. et al. (2013) The somatic genomic landscape of glioblastoma. Cell, 155, 462–477.