Nat Commun. 2022 Sep 26;13(1):5651.
doi: 10.1038/s41467-022-33351-4.

The comparison of cancer gene mutation frequencies in Chinese and U.S. patient populations

Fayang Ma et al. Nat Commun.

Abstract

Knowing the mutation frequency of cancer genes in China is crucial for reducing the global health burden. We integrate the tumor epidemiological statistics with cancer gene mutation rates identified in 11,948 cancer patients to determine their weighted proportions within a Chinese cancer patient cohort. TP53 (51.4%), LRP1B (13.4%), PIK3CA (11.6%), KRAS (11.1%), EGFR (10.6%), and APC (10.5%) are identified as the top mutated cancer genes in China. Additionally, 18 common cancer types from both China and U.S. cohorts are analyzed and classified into three patterns principally based upon TP53 mutation rates: TP53-Top, TP53-Plus, and Non-TP53. Next, corresponding similarities and prominent differences are identified upon comparing the mutational profiles from both cohorts. Finally, the potential population-specific and environmental risk factors underlying the disparities in cancer gene mutation rates between the U.S. and China are analyzed. Here, we show and compare the mutation rates of cancer genes in Chinese and U.S. population cohorts, for a better understanding of the associated etiological and epidemiological factors, which are important for cancer prevention and therapy.
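The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes the weighted proportion of a gene is the sum, over cancer types, of each type's share of all cancer cases multiplied by the gene's mutation rate in that type; the abstract does not spell out the exact formula, so this is an assumption. The function name and the toy numbers are hypothetical and are not taken from the study.

```python
def weighted_mutation_proportion(case_counts, mutation_rates):
    """Epidemiologically weighted mutation proportion for one gene.

    case_counts:    {cancer_type: number of cases} (epidemiological statistics)
    mutation_rates: {cancer_type: fraction of patients with the gene mutated}
    Assumption: weight = the cancer type's share of all listed cases.
    """
    total_cases = sum(case_counts.values())
    return sum(
        (cases / total_cases) * mutation_rates.get(cancer_type, 0.0)
        for cancer_type, cases in case_counts.items()
    )


# Toy, made-up inputs for illustration only (not data from the paper).
toy_cases = {"LUAD": 600, "COCA": 400}
toy_rates = {"LUAD": 0.50, "COCA": 0.25}
toy_weighted = weighted_mutation_proportion(toy_cases, toy_rates)
```

With these toy inputs the result is 0.6 × 0.50 + 0.4 × 0.25 = 0.40, which shows how a common cancer type pulls the pooled proportion toward its own mutation rate.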

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Workflow of the current study.
The workflow for calculating epidemiologically weighted cancer gene mutation proportions within the Chinese cancer population. Source data are provided as a Source Data File.
Fig. 2. Comparison of the mutation rates of the top 50 cancer genes between the CN and U.S. patient populations.
a Comparison of the epidemiologically weighted mutation rates of the 382 cancer genes derived from CN pan-cancer datasets (n = 11,948 patients) and U.S. pan-cancer datasets (n = 18,584 patients): p = 0.2014 (unpaired t test with Welch’s correction, two-tailed, Welch-corrected t = 1.279, df = 713; F test: F = 1.707, DFn = 381, DFd = 381, p < 0.0001); CN mean ± SE = 0.02067 ± 0.001578 (n = 382 genes), U.S. mean ± SE = 0.01813 ± 0.001208 (n = 382 genes); 95% confidence interval for the difference between means −0.001354 to 0.006437. b Correlation analysis of the epidemiologically weighted mutation rates of the 382 genes observed in the CN and U.S. patient populations; the blue dashed line indicates the linear trend. Pearson r = 0.9542, 95% confidence interval 0.9443 to 0.9624, p < 0.0001. c Correlation analysis of the mutation rates of the 381 genes between the two cohorts after the dominant contribution of TP53 was removed; Pearson r = 0.9307, 95% confidence interval 0.9159 to 0.9430, p < 0.0001. d Forty of the top 50 mutated cancer genes were shared between the CN and U.S. pan-cancer datasets. e, f Weighted mutation frequencies of the top 50 cancer genes derived from the CN (n = 11,948 patients) and U.S. (n = 18,584 patients) pan-cancer datasets, respectively. Bar levels represent the mean of the simulated mutation proportions, and error bars represent the 95% confidence limits determined through simulated samples (n = 2000 independent Poisson-distributed computational samples with the calculated mutation proportion as the central value). Source data are provided as a Source Data File.
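The statistics reported for panels a and b can be approximately re-derived from the rounded summary values in the legend. The sketch below assumes the standard Welch t statistic with Welch-Satterthwaite degrees of freedom for panel a, and a Fisher z-transformation for the confidence interval of Pearson r in panel b; the legend does not state how the interval was computed, so the latter is an assumption. Function names are hypothetical, and only the Python standard library is used.

```python
import math
from statistics import NormalDist


def welch_from_summary(mean1, se1, n1, mean2, se2, n2):
    """Welch t statistic and Welch-Satterthwaite df from group means and SEs."""
    var_sum = se1 ** 2 + se2 ** 2          # variance of the difference in means
    t = (mean1 - mean2) / math.sqrt(var_sum)
    df = var_sum ** 2 / (se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1))
    return t, df


def pearson_ci_fisher(r, n, level=0.95):
    """Confidence interval for Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Rounded summary values as printed in the Fig. 2 legend.
t_stat, welch_df = welch_from_summary(0.02067, 0.001578, 382,
                                      0.01813, 0.001208, 382)
ci_low, ci_high = pearson_ci_fisher(0.9542, 382)
```

Because the inputs are rounded, the recomputed values only approximate the legend: t comes out near 1.28 with about 713 degrees of freedom, and the interval for r is close to 0.9443 to 0.9624.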
Fig. 3. Top mutated genes in CN and U.S. cancers.
a, b The top mutated genes for each of the 18 cancer types in the China and U.S. pan-cancer cohorts. The case numbers for each cohort are BLCA (CN_163 vs U.S._411), BRCA (CN_303 vs U.S._1020), CESC (CN_76 vs U.S._289), COCA (CN_1541 vs U.S._545), ESCC (CN_914 vs U.S._95), GACA (CN_973 vs U.S._439), GBM (CN_286 vs U.S._896), HNSC (CN_94 vs U.S._508), KIRC (CN_243 vs U.S._361), LIHC (CN_1131 vs U.S._364), LUAD (CN_1370 vs U.S._1027), LUSC (CN_392 vs U.S._485), OVCA (CN_185 vs U.S._426), PAAD (CN_461 vs U.S._177), PRCA (CN_65 vs U.S._497), PTC (CN_71 vs U.S._346), SKCM (CN_27 vs U.S._366), and UCEC (CN_49 vs U.S._531). Cancer type abbreviations: Esophageal squamous cell carcinoma (ESCC), High-grade serous ovarian carcinoma (OVCA), Lung squamous cell carcinoma (LUSC), Head and neck squamous cell carcinoma (HNSC), Gastric adenocarcinoma (GACA), Bladder urothelial cancer (BLCA), Liver hepatocellular carcinoma (LIHC), Colorectal adenocarcinoma (COCA), Lung adenocarcinoma (LUAD), Breast cancer (BRCA), Pancreatic adenocarcinoma (PAAD), Brain glioblastoma/glioma (GBM), Papillary thyroid carcinoma (PTC), Skin cutaneous melanoma (SKCM), Kidney renal clear cell carcinoma (KIRC), Uterine corpus endometrial carcinoma (UCEC), Cervical squamous cell carcinoma (CESC), Prostate adenocarcinoma (PRCA). Source data are provided as a Source Data File.
Fig. 4. The classification of 18 common cancer types into three patterns.
a–c Comparison of the mutation rates of the top 50 cancer genes in each of the 18 most common solid tumor types from the China and U.S. cohorts. The 18 cancers are classified into three types (TP53-Top, TP53-Plus, and Non-TP53) based on the rank of TP53 with respect to other top mutated genes. The differences between the China and U.S. cohorts were statistically evaluated using the unpaired t test. * represents 0.01 < p < 0.05, ** represents 0.001 < p < 0.01, *** represents p < 0.001, two-sided, 95% confidence interval. The cancer type abbreviations are defined in the Fig. 3 legend. d The top 50 genes within each cancer type from both populations were overlapped to determine the corresponding common rate in the top 50 genes, which is calculated as 2 × (TotalGeneNumber − TotalUniqueGeneNumber) / TotalGeneNumber. Source data are provided as a Source Data File.
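The common-rate formula in panel d can be written out as a short sketch. It assumes TotalGeneNumber is the combined length of the two top-gene lists and TotalUniqueGeneNumber is the size of their union, which is the natural reading of the legend but is not stated explicitly. The function name and the placeholder gene labels are hypothetical.

```python
def top_gene_common_rate(genes_cn, genes_us):
    """Common rate = 2 * (total - unique) / total, following the Fig. 4d legend.

    Assumption: total counts both lists; unique counts distinct genes in
    their union, so (total - unique) is the number of shared genes.
    """
    total = len(genes_cn) + len(genes_us)
    unique = len(set(genes_cn) | set(genes_us))
    return 2 * (total - unique) / total


# Placeholder gene labels: two lists of 50 that share exactly 40 entries.
toy_cn = [f"G{i}" for i in range(0, 50)]
toy_us = [f"G{i}" for i in range(10, 60)]
toy_rate = top_gene_common_rate(toy_cn, toy_us)
```

For two lists of 50 genes the expression reduces to the number of shared genes divided by 50, so 40 shared genes give a common rate of 0.8.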
Fig. 5. The comparison of TP53 and EGFR mutation rates between CN and U.S.
a, b The mutation rates of TP53 across the 20 most common cancer types in the CN and U.S. cohorts, respectively, were compared: p = 0.6994 (unpaired t test, two-tailed, t = 0.3891, df = 38; F test: F = 1.379, DFn = 19, DFd = 19, p = 0.4899); CN mean ± SE = 0.4003 ± 0.07017 (n = 20), U.S. mean ± SE = 0.3644 ± 0.05975 (n = 20); 95% confidence interval −0.1508 to 0.2225. c The mutation rates of EGFR were visualized and compared in 20 cancers from the CN and U.S. cohorts. d, e In LUAD, the mutation rates of KRAS and EGFR in Chinese and U.S. patients were compared with respect to gender and smoking status. f Comparison of the top 50 mutated genes from Chinese (n = 676) and U.S. (n = 191) female LUAD non-smokers; the difference was not statistically significant (NS), p = 0.7221 (unpaired t test with Welch’s correction, two-tailed, t = 0.3568, df = 88; F test: F = 2.007, DFn = 49, DFd = 49, p = 0.0163); CN mean ± SE = 0.05003 ± 0.01532 (n = 50), U.S. mean ± SE = 0.04334 ± 0.01081 (n = 50); 95% confidence interval −0.03064 to 0.04402. g The accumulated mutation rates of KRAS (d) and EGFR (e) in the CN and U.S. cohorts, respectively. h The spatial distribution of EGFR mutations in Chinese female non-smokers with lung adenocarcinoma, visualized with a lollipop graph. Source data are provided as a Source Data File.
Fig. 6. Mutational signature analysis of lung adenocarcinoma, skin cutaneous melanoma, and papillary thyroid carcinoma in CN and U.S.
a, b The mutation signatures derived from the mutational spectra across the top 50 mutated genes in China (LUAD_CN, n = 550) and U.S. (LUAD_U.S., n = 57) female non-smokers. c–f The mutation signatures of skin cutaneous melanoma (SKCM_CN, n = 21; SKCM_U.S., n = 225) and papillary thyroid cancer (PTC_CN, n = 67; PTC_U.S., n = 104) derived from the mutational spectra of the top 50 mutated genes in the Chinese and U.S. patient cohorts, respectively. Source data for mutational signature analysis are provided in Supplementary Software 1.
