PLoS One. 2020 Nov 24;15(11):e0242780.
doi: 10.1371/journal.pone.0242780. eCollection 2020.

Discovering novel driver mutations from pan-cancer analysis of mutational and gene expression profiles

Houriiyah Tegally et al. PLoS One. 2020.

Abstract

Because genomic profiles across cancers vary from person to person, patient prognosis and treatment may differ according to the mutational signature of each tumour. It is therefore critical to understand the genomic drivers of cancer and to identify potential mutational commonalities across tumours originating at diverse anatomical sites. Large-scale cancer genomics initiatives such as TCGA, ICGC and GENIE have enabled the analysis of thousands of tumour genomes. Our goal was to use mutational and gene expression profiles to identify new cancer-causing mutations that may be common across tumour sites. Genomic and transcriptomic data from breast, ovarian and prostate cancers were aggregated and analysed using differential gene expression methods to identify the effect of specific mutations on the expression of multiple genes. Mutated genes associated with the most differentially expressed genes were considered novel candidates for driver mutations and were validated through literature mining, pathway analysis and clinical data investigation. Our driver selection method identified 116 probable novel cancer-causing genes, 4 of which were discovered in patients with no alterations in any known driver gene: MXRA5, OBSCN, RYR1 and TG. The candidate genes not previously officially classified as cancer-causing showed enrichment in cancer pathways and in cancer diseases. They also matched expected properties of cancer genes, for instance greater gene and protein lengths and mutation patterns suggesting oncogenic or tumour suppressor properties. Our approach identifies novel putative driver genes that are common across cancer sites in an unbiased way, without any a priori knowledge of pathways or gene interactions, and is therefore agnostic in how it identifies putative common driver genes acting at multiple cancer sites.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Summary of approach.
In this research, we identified novel driver mutations by computing the intersection of mutational and gene expression data, and then validated the candidate driver mutations using literature mining and pathway analysis. This study pooled mutational and gene expression data from three cancer types (breast, ovarian and prostate cancers) from TCGA datasets to demonstrate an unbiased approach for cancer-driver gene selection. a) Mutation and gene expression data are processed into mutation and expression matrices for integrative data analysis. b) Pre-selection of genes excludes non-pathogenic variants and intersects the remaining mutated genes across the three cancer types (TCGA datasets). c) The pre-selected genes are investigated for their effect on gene expression (as a measure of functionality) by performing differential gene expression analysis. d) The final genes are subjected to gene ontology and pathway enrichment for validation, and the same analysis is performed on patients.
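Steps b) and c) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the gene and patient names are hypothetical toy data, the function names are invented, and Welch's t statistic with an arbitrary cutoff stands in for whichever differential expression method the authors used, which this page does not specify. Counting affected genes per mutated gene is one possible reading of "associated with the most differentially expressed genes".

```python
from math import sqrt
from statistics import mean, variance

def preselect(mutated_by_cancer):
    """Step b (sketch): keep genes mutated in every cancer type."""
    gene_sets = [set(genes) for genes in mutated_by_cancer.values()]
    return set.intersection(*gene_sets)

def welch_t(a, b):
    """Welch's t statistic for two samples (assumed stand-in test)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # Degenerate case: no spread in either group.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / se

def count_de_genes(mutated_patients, expression, t_cutoff=2.0):
    """Step c (sketch): count genes whose expression differs between
    patients carrying the mutation and patients without it."""
    n_de = 0
    for gene, values in expression.items():  # values: patient -> level
        mut = [v for p, v in values.items() if p in mutated_patients]
        wt = [v for p, v in values.items() if p not in mutated_patients]
        if len(mut) < 2 or len(wt) < 2:
            continue  # too few samples to estimate a variance
        if abs(welch_t(mut, wt)) >= t_cutoff:
            n_de += 1
    return n_de

# Hypothetical toy data, not taken from the study.
mutated_by_cancer = {
    "BRCA-US": {"GENE_A", "GENE_B", "GENE_C"},
    "OV-US": {"GENE_A", "GENE_B", "GENE_D"},
    "PRAD-US": {"GENE_A", "GENE_B", "GENE_E"},
}
expression = {
    "TARGET_1": {"p1": 10.0, "p2": 11.0, "p3": 2.0, "p4": 3.0},
    "TARGET_2": {"p1": 5.0, "p2": 6.0, "p3": 5.5, "p4": 5.2},
}
preselected = preselect(mutated_by_cancer)
n_affected = count_de_genes({"p1", "p2"}, expression)
```

In this toy input, only `TARGET_1` separates clearly between the mutated and unmutated groups. A real analysis would also need multiple-testing correction, which the sketch omits.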
Fig 2. Mutated genes of interest.
Circos plots showing the distribution, across the human genome, of the 3700 pre-selected genes (inner circle) commonly mutated in the BRCA-US, OV-US and PRAD-US cancer data sets, including COSMIC (orange) and non-COSMIC (green) genes (red). The second circle from the middle shows the 1537 cancer-causing candidate genes, with non-COSMIC genes in blue and COSMIC genes in red, labeled with their gene names.
Fig 3. Gene set enrichment and sequence analysis.
a) KEGG pathway enrichment for candidate genes, showing the number of genes with specific enrichment for the most enriched pathways. b) Disease signature enrichment showing gene enrichment in cancer-related conditions. c) Gene and protein length comparison between the candidate genes, COSMIC genes and non-COSMIC genes (gene-length K-S test p-values: candidate vs. non-cancer genes = 0.0, COSMIC vs. non-cancer genes < 0.001; protein-length p-values: candidate vs. non-cancer genes = 0.0, COSMIC vs. non-cancer genes < 0.001). d) Percentage of oncogenes (blue) and tumor suppressors (red), as defined by the 20/20 rule [3], in the different gene groups within each cancer type (Chi-square tests of results for candidate genes vs. non-COSMIC genes, and COSMIC genes vs. non-COSMIC genes: all p-values < 0.001 for all cancer types for both oncogene and tumor-suppressor classifications).
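The 20/20 rule cited in panel d) comes from reference [3]: a gene is called an oncogene if more than 20% of its recorded mutations are missense mutations at recurrent positions, and a tumor suppressor if more than 20% are inactivating. The sketch below shows one way to compute both scores. The mutation labels, the definition of "recurrent" as a position hit at least twice, and the function name are assumptions for illustration; how the authors applied the rule is not described on this page.

```python
from collections import Counter

# Assumed labels for inactivating mutation types.
INACTIVATING = {"nonsense", "frameshift", "splice_site"}

def twenty_twenty(mutations, threshold=0.20):
    """Score one gene under the 20/20 rule (sketch).

    mutations: list of (position, kind) tuples, one per recorded mutation.
    Returns both scores and both flags; no precedence is imposed when a
    gene exceeds both thresholds.
    """
    total = len(mutations)
    if total == 0:
        return {"oncogene_score": 0.0, "tsg_score": 0.0,
                "oncogene": False, "tumor_suppressor": False}
    missense_hits = Counter(pos for pos, kind in mutations if kind == "missense")
    # Missense mutations at positions hit at least twice (assumed definition).
    recurrent_missense = sum(n for n in missense_hits.values() if n >= 2)
    inactivating = sum(1 for _, kind in mutations if kind in INACTIVATING)
    onc_score = recurrent_missense / total
    tsg_score = inactivating / total
    return {"oncogene_score": onc_score, "tsg_score": tsg_score,
            "oncogene": onc_score > threshold,
            "tumor_suppressor": tsg_score > threshold}

# Hypothetical mutation lists, not taken from the study.
hotspot_gene = [(600, "missense")] * 6 + [
    (12, "missense"), (88, "nonsense"), (140, "silent"), (301, "missense"),
]
truncated_gene = [
    (10, "nonsense"), (55, "frameshift"), (90, "frameshift"),
    (120, "missense"), (200, "missense"),
]
hotspot_result = twenty_twenty(hotspot_gene)
truncated_result = twenty_twenty(truncated_gene)
```

Returning both scores, instead of a single label, leaves the tie-breaking policy to the caller.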
Fig 4. Driver gene discovery in patients with no alterations in COSMIC genes.
a–c) Oncoplots for the 4 significant driver genes discovered in patients with no alterations in COSMIC genes, shown for each gene in our complete datasets (all patients). d) Proportion of genes whose expression levels change when each of the four genes is mutated, in each of the three cancer types, showing both under-expression and over-expression effects. e) Classification of the four genes as oncogene or tumor suppressor in each of the three cancer types.
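Selecting patients with no alterations in known driver genes amounts to a set filter over the mutation matrix. The following is a minimal sketch under stated assumptions: the patient identifiers, gene names and function names are hypothetical, and the known-driver set stands in for the COSMIC gene list.

```python
from collections import Counter

def patients_without_known_drivers(mutations_by_patient, known_drivers):
    """Keep patients whose mutated genes share nothing with the known drivers."""
    return {
        patient: genes
        for patient, genes in mutations_by_patient.items()
        if not (set(genes) & known_drivers)
    }

def mutation_frequency(mutations_by_patient):
    """Fraction of patients in which each gene is mutated."""
    n = len(mutations_by_patient)
    if n == 0:
        return {}
    counts = Counter(g for genes in mutations_by_patient.values() for g in set(genes))
    return {gene: c / n for gene, c in counts.items()}

# Hypothetical toy data, not taken from the study.
known_drivers = {"DRIVER_1", "DRIVER_2"}
mutations_by_patient = {
    "p1": {"DRIVER_1", "GENE_X"},
    "p2": {"GENE_X", "GENE_Y"},
    "p3": {"GENE_X"},
    "p4": {"DRIVER_2"},
}
subset = patients_without_known_drivers(mutations_by_patient, known_drivers)
freq = mutation_frequency(subset)
```

Genes that remain frequent in this subset are natural candidates to examine further, since no known driver explains those tumours.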

References

    1. Greaves M, Maley CC. Clonal evolution in cancer. Nature. 2012;481: 306–313. doi: 10.1038/nature10762
    2. Garraway LA, Lander ES. Lessons from the cancer genome. Cell. 2013;153: 17–37. doi: 10.1016/j.cell.2013.03.002
    3. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA, Kinzler KW. Cancer genome landscapes. Science. 2013;339: 1546–1558. doi: 10.1126/science.1235122
    4. Doroshow DB, Doroshow JH. Genomics and the history of precision oncology. Surg Oncol Clin N Am. 2020;29: 35–49. doi: 10.1016/j.soc.2019.08.003
    5. Zhang J, Baran J, Cros A, Guberman JM, Haider S, Hsu J, et al. International Cancer Genome Consortium Data Portal—a one-stop shop for cancer genomics data. Database (Oxford). 2011;2011: bar026. doi: 10.1093/database/bar026
