Proc Natl Acad Sci U S A. 2019 Sep 17;116(38):18962-18970.
doi: 10.1073/pnas.1901156116. Epub 2019 Aug 28.

Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures


Sushant Kumar et al. Proc Natl Acad Sci U S A. 2019.

Abstract

Large-scale exome sequencing of tumors has enabled the identification of cancer drivers using recurrence-based approaches. Some of these methods also employ 3D protein structures to identify mutational hotspots in cancer-associated genes. In determining such mutational clusters in structures, existing approaches overlook protein dynamics, despite its essential role in protein function. We present a framework to identify cancer driver genes using a dynamics-based search of mutational hotspot communities. Mutations are mapped to protein structures, which are partitioned into distinct residue communities. These communities are identified in a framework where residue-residue contact edges are weighted by correlated motions (as inferred by dynamics-based models). We then search for signals of positive selection among these residue communities to identify putative driver genes, and we apply our method to The Cancer Genome Atlas (TCGA) PanCancer Atlas missense mutation catalog. Overall, we predict 1 or more mutational hotspots within the resolved structures of proteins encoded by 434 genes. These genes were enriched in biological processes associated with tumor progression. Additionally, a comparison between our approach and existing cancer hotspot detection methods using structural data suggests that including protein dynamics significantly increases the sensitivity of driver detection.

Keywords: PanCancer; TCGA; cancer driver; hotspot communities; protein dynamics.
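
The partitioning step described in the abstract (contact edges weighted by correlated motions, then grouped into residue communities) can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: it keeps only contacts whose absolute motion correlation exceeds a threshold and reports the connected components of the resulting graph as "communities", a crude stand-in for a proper community-detection method. The function name, the 8.0 Å contact cutoff, the 0.5 correlation cutoff, and the input layout are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def contact_communities(coords, corr, contact_cutoff=8.0, corr_cutoff=0.5):
    """Group residues into connected components of a filtered contact graph.

    coords: dict mapping residue id -> (x, y, z) coordinates (hypothetical input).
    corr:   dict mapping frozenset({i, j}) -> motion correlation in [-1, 1].
    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    # Keep an edge only if the residues are in contact AND move in a
    # strongly correlated (or anti-correlated) way.
    adjacency = {res: set() for res in coords}
    for i, j in combinations(coords, 2):
        if dist(coords[i], coords[j]) > contact_cutoff:
            continue
        weight = abs(corr.get(frozenset((i, j)), 0.0))
        if weight >= corr_cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)

    # Connected components via an iterative depth-first search.
    seen = set()
    communities = []
    for start in coords:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            res = stack.pop()
            if res in component:
                continue
            component.add(res)
            stack.extend(adjacency[res] - component)
        seen |= component
        communities.append(sorted(component))
    return communities


# Toy example: five residues on a line, two spatially separated groups.
toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0),
              4: (30.0, 0.0, 0.0), 5: (33.8, 0.0, 0.0)}
toy_corr = {frozenset((1, 2)): 0.9, frozenset((2, 3)): 0.8,
            frozenset((1, 3)): 0.1, frozenset((4, 5)): 0.7}
toy_communities = contact_communities(toy_coords, toy_corr)
```

In the toy input, residues 1 and 3 are within the contact cutoff but weakly correlated, so they are joined only through residue 2. A real analysis would derive the correlations from a dynamics model and use a dedicated community-detection algorithm rather than a hard threshold.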

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Workflow of HotCommics to identify putative driver genes: This integrative approach utilizes protein community information along with mapped mutations to identify significantly mutated communities in protein structures. Fisher’s method is employed to quantify the significance of variant enrichment in each community with mapped mutations (thereby defining the hotspot communities).
Fig. 2.
Pan-cancer analysis of putative driver genes with hotspot communities. (A) Pan-cancer QQ plot for genes with hotspot communities. (B) PhyloP conservation score comparisons between mutations occupying hotspot communities and those occupying nonhotspot communities on protein structures. (C) CADD score correlation between mutations occupying hotspot communities and nonhotspot communities on protein structures. (D) Biological process enrichment analysis for putative driver genes with at least 1 hotspot. The x axis corresponds to the gene ratio quantifying the fraction of putative driver genes belonging to a particular biological process. The color code and size correspond to corrected P value and number of genes involved in the biological process, respectively. (E) Reactome-based pathway enrichment analysis. The color code and size correspond to corrected P value and number of genes involved in the biological process, respectively.
Fig. 3.
Comparison with other hotspot detection tools. (A) Comparison of multiple driver detection algorithms represented using the upset plot. We used the most recent version of the CGC database for this analysis. All algorithms were run on the TCGA-MC3 variant call set. Numbers of identified driver genes common to different sets of methods are shown in the bar chart (Upper), and those unique to specific methods in each set are indicated with solid points below the bar chart. (B) QQ plot highlighting differentially expressed putative driver genes across multiple cancer types. (C) Pathway-level enrichment analysis of those singleton genes identified by HotCommics that were novel (with respect to putative driver genes identified by other algorithms and/or the CGC database).
Fig. 4.
Examples of a tumor-suppressor gene, an oncogene, and a putative driver with hotspot communities. (A) Hotspot communities (shown in red) in PIK3R1, as identified by our workflow. Previous studies have also identified the PIK3R1 gene as a tumor-suppressor gene. (B) Hotspot communities in BRAF, as identified by our workflow. Previous studies have identified the BRAF gene as an oncogene. (C) Hotspot communities in PTPRD, as identified by our workflow. PTPRD is an example of a novel putative driver gene.
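
The Fig. 1 caption states that Fisher's method is used to quantify the significance of variant enrichment in each community. Which individual P values are combined is not described on this page, so the following is only a generic sketch of Fisher's method itself: k independent P values are combined through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null. The function name and the example inputs are hypothetical. Because the degrees of freedom are always even, the chi-square tail probability has a closed form and no external library is needed.

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine independent P values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees of
    freedom: P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not pvalues:
        raise ValueError("at least one P value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("P values must lie in (0, 1]")

    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)  # equals X / 2

    # Accumulate the series sum_{i=0}^{k-1} (X/2)^i / i! term by term.
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term
    return min(1.0, exp(-half_stat) * series)


# Hypothetical P values, e.g. one per test performed on a single community.
combined = fisher_combined_pvalue([0.05, 0.05])
```

With a single input the function returns that P value unchanged, and for two inputs it reduces to the known identity p1·p2·(1 − ln(p1·p2)). Fisher's method assumes the combined tests are independent; correlated tests would make the combined P value anti-conservative.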

