J Natl Cancer Inst. 2022 Jul 11;114(7):988-995.
doi: 10.1093/jnci/djac068.

Cancer Relevance of Human Genes

Tao Qing et al. J Natl Cancer Inst.

Abstract

Background: We hypothesize that genes that directly or indirectly interact with core cancer genes (CCGs) in a comprehensive gene-gene interaction network may have functional importance in cancer.

Methods: We categorized 12 767 human genes into CCGs (n = 468) and genes that are 1 (n = 5467), 2 (n = 5573), 3 (n = 915), or more than 3 (n = 416) steps removed from the nearest CCG in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) network. We estimated cancer-relevant functional importance in these neighborhood categories using 1) gene dependency score, which reflects the effect of a gene on cell viability after knockdown; 2) somatic mutation frequency in The Cancer Genome Atlas; 3) effect size, which estimates to what extent a mutation in a gene enhances cell survival; and 4) negative selection pressure of germline protein-truncating variants in healthy populations.
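The neighborhood categories described above amount to a shortest-path distance from each gene to its nearest CCG. The following is a minimal sketch of one way to compute such distances, using a multi-source breadth-first search over an undirected edge list. It is not the authors' implementation: the function names `steps_from_core` and `neighborhood_category`, the toy gene labels, and the decision to omit genes with no path to any CCG are all assumptions made for illustration.

```python
from collections import deque


def steps_from_core(edges, core_genes):
    """Return {gene: steps to the nearest core gene} using multi-source BFS.

    Genes with no path to any core gene are omitted (an assumption of this sketch).
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Every core gene present in the network starts at distance 0.
    distance = {gene: 0 for gene in core_genes if gene in neighbors}
    queue = deque(distance)
    while queue:
        gene = queue.popleft()
        for nxt in neighbors[gene]:
            if nxt not in distance:  # first visit is the shortest distance
                distance[nxt] = distance[gene] + 1
                queue.append(nxt)
    return distance


def neighborhood_category(steps):
    """Map a step count to the category labels used in the abstract."""
    if steps == 0:
        return "CCG"
    return str(steps) if steps <= 3 else ">3"


# Toy network with hypothetical labels; these are not real STRING interactions.
toy_edges = [("core_A", "g1"), ("g1", "g2"), ("g2", "g3"), ("g3", "g4"),
             ("core_B", "g5")]
toy_steps = steps_from_core(toy_edges, {"core_A", "core_B"})
toy_categories = {g: neighborhood_category(s) for g, s in toy_steps.items()}
```

Starting the search from all core genes at once gives each gene its distance to the nearest core gene in a single pass, rather than running one search per core gene.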

Results: Cancer biology-related functional importance of genes decreases as their distance from the CCGs increases. Genes closer to cancer genes show greater connectedness in the network, have greater importance in maintaining cancer cell viability, are under greater negative germline selection pressure, and have higher somatic mutation frequency in cancer. Based on these 4 metrics, we provide cancer relevance annotation to known human genes.

Conclusions: A large number of human genes are connected to CCGs and could influence cancer biology to various extents when dysregulated; any given mutation may be functionally important in one individual but not in another, depending on genomic context.

Figures

Figure 1.
Study schema. Overview of the hypothesis that genes closer to core cancer genes in the STRING network are more functionally important in cancer development. STRING = Search Tool for the Retrieval of Interacting Genes/Proteins.
Figure 2.
Connectedness of cancer genes. A) STRING interaction network. Each dot represents a gene, and colors indicate distance from CCGs. The gray lines show between-gene connections. B) Number of human genes in 4 cancer gene neighborhood categories. C) Proportion of genes implicated in cancer biology in the literature (reported or not reported in connection with cancer) by neighborhood category. D) Distribution of log2-transformed connectivity scores of 12 767 human genes in STRING. E) Distribution of log2-transformed connectivity scores for the cancer genes and 4 neighborhood categories. P values were calculated using the 1-sided Mann-Whitney U test (alternative hypothesis: values for genes in a closer neighborhood category are greater than those for all genes in the more remote categories). CCGs = core cancer genes; STRING = Search Tool for the Retrieval of Interacting Genes/Proteins.
Figure 3.
Cell viability dependence scores for cancer genes and genes in different cancer gene neighborhood categories. A) Distribution of DepMap CRISPR-based dependency scores. B) Distribution of DepMap RNAi-based dependency scores. Y-axes are dependency scores: the lower the value, the more important the gene is for cell viability. P values were calculated using the 1-sided Mann-Whitney U test (alternative hypothesis: values for genes in a closer neighborhood category are greater than those for all genes in the more remote categories). CCGs = core cancer genes; CRISPR = CRISPR-Cas9-mediated; RNAi = RNA interference.
Figure 4.
Somatic mutation frequencies of genes and cancer effect sizes of variants in genes across CCGs and 4 neighborhood categories in 21 well-sampled TCGA cancer types. A) Many TCGA cancer types show decreasing somatic mutation frequency for genes with increasing distance from CCGs (FDR < 0.05). B) Average cancer gene effect size (scaled selection coefficients) of variants in all genes of the 4 neighborhood categories decreases with increasing distance from CCGs. P values were calculated using the 1-sided Mann-Whitney U test (alternative hypothesis: values for genes in a closer neighborhood category are greater than those for all genes in the more remote categories). ACC = Adrenocortical carcinoma; BLCA = Bladder Urothelial Carcinoma; BRCA = Breast invasive carcinoma; CCGs = core cancer genes; CESC = Cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL = Cholangiocarcinoma; COAD = Colon adenocarcinoma; DLBC = Lymphoid Neoplasm Diffuse Large B-cell Lymphoma; ESCA = Esophageal carcinoma; FDR = false discovery rate; GBM = Glioblastoma multiforme; HNSC = Head and Neck squamous cell carcinoma; KICH = Kidney Chromophobe; KIRC = Kidney renal clear cell carcinoma; KIRP = Kidney renal papillary cell carcinoma; LGG = Brain Lower Grade Glioma; LIHC = Liver hepatocellular carcinoma; LUAD = Lung adenocarcinoma; LUSC = Lung squamous cell carcinoma; MESO = Mesothelioma; MISC = Miscellaneous; OV = Ovarian serous cystadenocarcinoma; PAAD = Pancreatic adenocarcinoma; PCPG = Pheochromocytoma and Paraganglioma; PRAD = Prostate adenocarcinoma; READ = Rectum adenocarcinoma; SARC = Sarcoma; SKCM = Skin Cutaneous Melanoma; STAD = Stomach adenocarcinoma; TCGA = The Cancer Genome Atlas; TGCT = Testicular Germ Cell Tumors; THYM = Thymoma; THCA = Thyroid carcinoma; UCS = Uterine Carcinosarcoma; UCEC = Uterine Corpus Endometrial Carcinoma; UVM = Uveal Melanoma.
Figure 5.
Germline selection pressure on genes in different cancer-gene neighborhood categories. A) Selection pressure against protein-truncating variants (PTV): the lower the Sh score, the more tolerant the gene is of a germline PTV. B) Loss-of-function variant intolerance (pLI): the lower the pLI score, the more tolerant the gene is of germline loss-of-function variants. P values were calculated using the 1-sided Mann-Whitney U test (alternative hypothesis: values for genes in a closer neighborhood category are greater than those for all genes in the more remote categories). CCGs = core cancer genes.
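The figure legends report 1-sided Mann-Whitney U tests comparing a closer neighborhood category against the more remote ones. As a rough illustration of what such a test computes, the sketch below derives the U statistic from midranks and an approximate upper-tail P value from the normal approximation with tie and continuity corrections. This is an assumption-laden sketch, not the authors' code: the function name `mann_whitney_u_greater` and the sample values are hypothetical, and a statistics package with exact small-sample P values may give different numbers.

```python
import math


def mann_whitney_u_greater(x, y):
    """One-sided Mann-Whitney U test; alternative: x tends to exceed y.

    Returns (U statistic for x, approximate P value from the normal
    approximation with tie and continuity corrections).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0  # all values identical: no evidence either way
    z = (u - mean - 0.5) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return u, p


# Hypothetical scores for a closer and a more remote category.
closer = [5.0, 6.0, 7.0, 8.0]
remote = [1.0, 2.0, 3.0, 4.0]
u_stat, p_value = mann_whitney_u_greater(closer, remote)
```

Because the test is one-sided, swapping the two arguments yields a large P value rather than the same one, which is why the direction stated in each legend matters when reading the reported values.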
