Nat Commun. 2025 Oct 2;16(1):8787.
doi: 10.1038/s41467-025-63842-z.

Cross-ancestral GWAS identifies 29 variants across head and neck cancer subsites

Elmira Ebrahimi et al. Nat Commun.

Abstract

Head and neck squamous cell carcinoma (HNSCC) includes diverse cancers arising in the oral cavity, oropharynx, and larynx, with the main risk factors being environmental exposures such as tobacco, alcohol, and human papillomavirus (HPV) infection. The genetic factors contributing to susceptibility across different populations and tumour subsites remain incompletely understood. Here, through a genome-wide association and fine-mapping study of over 19,000 HNSCC cases and 38,000 controls from multiple ancestries, we identify 18 genetic risk variants and 11 signals from fine mapping of the human leukocyte antigen (HLA) region, all previously unreported. rs78378222, a regulatory variant for TP53, is associated with a 40% reduction in overall HNSCC risk. We also identify gene-environment interactions, with BRCA2 and ADH1B variants showing effects modified by smoking and alcohol use. Subsite-specific analysis of the HLA region reveals distinct immune-related associations across HPV-positive and HPV-negative tumours. These findings refine the genetic architecture of HNSCC and highlight mechanisms linking inherited variation, immunity, and environmental exposures.

Conflict of interest statement

Competing interests: All authors declare no competing interests.

Figures

Fig. 1. Novel risk loci identified for HNSCC.
a Circular Manhattan plots showing novel risk loci identified in GWAS meta-analyses. Red labels indicate the cytogenetic locations of novel signals identified in meta-analyses for all sites combined or for specific subsites. Black labels represent previously identified risk loci. Red lines mark the threshold for genome-wide significance (p = 5 × 10−8). b Circular Manhattan plots from the European GWAS analyses of HPV(+) and HPV(−) oropharyngeal cancer. Separate Manhattan plots for each ancestry group can be found in Supplementary Fig. 1.
Fig. 2. Overview of genomic and functional characterisation of the 3′ UTR variant rs78378222.
a Regional association plot for the TP53 3′ UTR variant rs78378222 at chromosome 17p13. Each point represents a single-nucleotide polymorphism (SNP) and its association P value (−log₁₀ scale) from a logistic-regression test under an additive genetic model. The horizontal dashed line indicates the genome-wide significance threshold (P = 5 × 10⁻⁸). SNPs are colour-coded by their pairwise linkage disequilibrium (r²) with rs78378222, calculated in European samples from the 1000 Genomes Phase 3 reference panel (r² bins: 0–0.2, 0.2–0.4, 0.4–0.6, 0.6–0.8, 0.8–1.0). b PM-plot of subsite-specific association results for rs78378222. The x-axis represents the m-value, the posterior probability that a genuine genetic effect exists in each head and neck cancer subsite, estimated with METASOFT’s binary-effects (BE) model. An m-value ≥ 0.9 indicates strong evidence for an effect, ≤ 0.1 indicates no effect, and values between 0.1 and 0.9 denote uncertainty. The y-axis displays −log₁₀ P values obtained from the per-allele additive logistic-regression GWAS conducted separately for each subsite. Subsite abbreviations: OC = oral cavity, LA = larynx, HPC = hypopharynx, OPC− = HPV-negative oropharynx, OPC+ = HPV-positive oropharynx. c Z-Z locus plot showing that rs78378222, the lead variant, is associated with reduced TP53 expression in whole blood, with a high PP4 score of 99%. d The cytogenetic location of rs78378222, along with its sequence and allele change, is mapped at the chromosomal level. According to TarBase, this variant overlaps with multiple predicted microRNA binding sites. This figure was created in BioRender. https://BioRender.com/8c2hqe9. Source data are provided as a Source Data file.
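The m-value thresholds quoted in the caption can be written as a small helper. This is only a sketch of the interpretation rule stated above; the function name and labels are hypothetical, and it does not reproduce how METASOFT estimates the m-value itself.

```python
def classify_m_value(m: float) -> str:
    """Interpret an m-value using the thresholds quoted in the caption.

    The function name and the returned labels are illustrative only.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("an m-value is a posterior probability in [0, 1]")
    if m >= 0.9:
        return "effect"      # strong evidence that an effect exists
    if m <= 0.1:
        return "no effect"   # strong evidence that no effect exists
    return "uncertain"       # between 0.1 and 0.9


# Illustrative values, not taken from the study.
for m in (0.95, 0.5, 0.05):
    print(m, classify_m_value(m))
```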
Fig. 3. Genomic and functional characterisation of 5q31 variants rs28419191 and rs1131769.
a Regional association plot for the two independent lead single-nucleotide polymorphisms (SNPs) rs28419191 and rs1131769 on chromosome 5. Each point shows the −log₁₀ P-value from a logistic-regression test under an additive genetic model in each cohort. The horizontal dashed line marks the genome-wide significance threshold (P = 5 × 10⁻⁸). b PM-plots for rs28419191 and rs1131769. On the plot, the x-axis represents the m-value, the posterior probability that a genuine genetic effect exists in each head and neck cancer subsite, estimated with METASOFT’s binary-effects (BE) model. An m-value ≥ 0.9 indicates strong evidence for an effect, ≤ 0.1 indicates no effect, and values between 0.1 and 0.9 denote uncertainty. The y-axis displays −log₁₀ P-values obtained from the per-allele additive logistic-regression GWAS conducted separately for each subsite. Subsite abbreviations: OC = oral cavity, LA = larynx, HPC = hypopharynx, OPC− = HPV-negative oropharynx, OPC+ = HPV-positive oropharynx. Source data are provided as a Source Data file. c Z-Z locus plot showing colocalization of rs28419191 and rs1131769 with CTNNA1 expression in whole blood, both with a PP4 score of 99%.
Fig. 4. Gene-environment interactions with alcohol and smoking.
Effect estimates for (a) rs11571833 (BRCA2), (b) rs1229984 (ADH1B), and (c) rs58365910 (CHRNA5) stratified by smoking and drinking (Never smoker-Never drinker, Smoker Only, Drinker Only, and Ever smoker-Ever drinker) from the meta-analysis, and within European and Mixed groups. In each panel, odds ratios and CIs were estimated by logistic regression under an additive genetic model in each ancestral group and then combined by fixed-effects inverse-variance meta-analysis. For each exposure/genotype category, we report the exact number of independent subjects: never-smoker/never-drinker (n = 876 cases, 2713 controls), drinker only (n = 726 cases, 9552 controls), smoker only (n = 2739 cases, 2242 controls) and ever-ever (both smoker and drinker; n = 4860 cases, 10,002 controls). Heterogeneity between the 2 ancestries was assessed by Cochran’s Q test. Only the odds ratios (OR) and 95% confidence intervals (CI), p-value and p-heterogeneity (p-het) for the meta-analysis are shown here. Source data are provided as a Source Data file.
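The caption states that per-ancestry estimates were combined by fixed-effects inverse-variance meta-analysis, with heterogeneity assessed by Cochran's Q. A minimal Python sketch of that calculation follows, assuming per-group log odds ratios and standard errors as inputs. The function names and the numbers in the usage lines are illustrative and are not taken from the study; the heterogeneity p-value helper covers only the two-group case (1 degree of freedom), matching the two ancestral groups in the caption.

```python
import math


def fixed_effects_meta(log_ors, ses):
    """Inverse-variance fixed-effects pooling of per-group log odds ratios.

    Returns (pooled_log_or, pooled_se, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / SE_i^2
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    return pooled, pooled_se, q


def q_pvalue_two_groups(q):
    """Heterogeneity p-value for exactly two groups (chi-square, 1 df)."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical inputs for two ancestral groups (not study data).
pooled, pooled_se, q = fixed_effects_meta([0.0, 0.2], [0.1, 0.1])
lower = math.exp(pooled - 1.959964 * pooled_se)
upper = math.exp(pooled + 1.959964 * pooled_se)
print(f"OR = {math.exp(pooled):.2f}, 95% CI: {lower:.2f},{upper:.2f}")
print(f"Q = {q:.2f}, p-het = {q_pvalue_two_groups(q):.3f}")
```

With more than two groups, Q would be compared against a chi-square distribution with k − 1 degrees of freedom, which needs a general chi-square survival function rather than the shortcut used here.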
Fig. 5. Cross-ancestry HLA risk loci of HPV(+) OPC.
a Manhattan plots showing all independent lead variants for risk of HPV(+) OPC (cases = 2,207; controls = 38,973). Variants highlighted under the significance threshold reached significance in later rounds; only the plot from the first round of stepwise logistic-regression analysis is shown here. Novel variants are orange; known variants are grey. The horizontal red line reflects the HLA significance threshold (p < 2.4 × 10−6), adjusted using the Bonferroni correction. DRB1 37Asn/Ser and DRB1 233Thr are within DRB1*13:01-DQA1*01:03-DQB1*06:03, while HLA-B 67Cys/Ser/Tyr was associated with the haplotype. b Out of the five interchangeable amino acid residues in LD with DRB1 233Thr (OR = 1.27, 95% CI: 1.17,1.38, p = 7.15 × 10−9), with △BIC ± 2, DRB1 12Lys (OR = 1.25, 95% CI: 1.16,1.35, p = 1.97 × 10−8) and DRB1 10Glu/Gln (OR = 1.25, 95% CI: 1.16,1.35, p = 1.56 × 10−8) are in the HLA-DR binding pocket and have similar effects. c Model accuracy and risk estimates of amino acid residues and haplotypes. Model 1: six variants identified from fine-mapping, DRB1 233Thr (OR = 1.27, 95% CI: 1.17,1.38, p = 7.15 × 10−9), DRB1 37Asn/Ser (OR = 0.68, 95% CI: 0.63,0.73, p = 3.22 × 10−23), rs2523679 (OR = 0.63, 95% CI: 0.53,0.75, p = 2.26 × 10−7), rs4143334 (OR = 1.89, 95% CI: 1.51,2.35, p = 1.91 × 10−8), rs2308655 (OR = 1.36, 95% CI: 1.25,1.48, p = 3.91 × 10−12), and HLA-B 67Cys/Ser/Tyr (OR = 0.81, 95% CI: 0.74,0.88, p = 1.33 × 10−6), used as the baseline reference model; Model 2: replaces DRB1 37Asn/Ser and DRB1 233Thr (highlighted in red) with the haplotype (OR = 0.53, 95% CI: 0.44,0.65, p = 3.82 × 10−10) (highlighted in blue), and the effect of HLA-B 67Cys/Ser/Tyr disappears (OR = 0.95, 95% CI: 0.88,1.02, p = 0.16); Model 3: all 3 amino acids replaced with the haplotype (OR = 0.53, 95% CI: 0.44,0.65, p = 3.82 × 10−10). d Allele frequencies of DRB1*13:01-DQA1*01:03-DQB1*06:03 and of having all three amino acid residues by ancestry. e The HLA-B 156Trp amino acid change and the HLA-B 15:01 allele are specific to European ancestry, but the rs2523679 variant, which is in LD with both, has a cross-ancestral effect. This figure was created in BioRender. https://BioRender.com/98q9ivz. Source data are provided as a Source Data file.
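The odds ratios, confidence intervals and p-values in this caption are linked through the Wald test. As a rough sanity check, the sketch below recovers an approximate standard error, z-score and two-sided p-value from a reported OR and 95% CI. It assumes a symmetric Wald interval on the log-odds scale, which the caption does not state, and the function name is hypothetical. Because the published values are rounded to two decimals, the result only approximates the reported p-value.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_from_or_ci(odds_ratio, ci_lower, ci_upper):
    """Approximate (beta, se, z, p) from an OR and its 95% CI.

    Assumes the interval is exp(beta +/- Z_95 * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


# DRB1 233Thr as reported in the caption: OR = 1.27, 95% CI: 1.17,1.38.
beta, se, z, p = wald_from_or_ci(1.27, 1.17, 1.38)
print(f"beta = {beta:.3f}, se = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```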
Fig. 6. Novel HLA risk loci for HPV(−) oropharynx and oral cavity cancer.
Manhattan plots display all independent lead variants for risk of the HPV(−) OPC (cases = 1470; controls = 38,973) and OC (cases = 5578; controls = 38,973) subsites. Variants highlighted under the significance threshold reached significance in later rounds; only the plot from the first round of stepwise logistic-regression analysis is shown here. Novel variants are highlighted in red; known variants are in grey. The horizontal red line reflects the HLA significance threshold (p < 2.4 × 10−6), adjusted using the Bonferroni correction. a HPV(−) oropharynx: The lead SNP, (b) rs1131212 (OR = 1.33, 95% CI: 1.19,1.49, p = 5.33 × 10−7), causes an amino acid change from Gln to His at residue 94 located in the HLA-B protein binding pocket (PDB ID: 2BVP). This variant is in LD (r² = 1) with 70Asn/Ser (OR = 1.32, 95% CI: 1.18,1.47, p = 8.81 × 10−7). The right panel shows the comparable risk effects of the two related signals. The known SNP, (c) rs1264813 (OR = 1.37, 95% CI: 1.22,1.55, p = 2.77 × 10−7), is in high LD (r² = 0.77) with the HLA-A*24 allele (OR = 1.34, 95% CI: 1.18,1.52, p = 7.24 × 10−6) and shows comparable risk effects, as shown in the right panel. d Oral cavity: The lead SNP, rs9268925 (OR = 0.81, 95% CI: 0.75,0.87, p = 1.36 × 10−7), is highly correlated with a novel risk haplotype, DRB1*15:01-DQA1*01:02-DQB1*06:02 (OR = 0.8, 95% CI: 0.73,0.86, p = 2.15 × 10−8), and has a similar risk effect, as shown in the right panel. A model accuracy difference (△BIC) lower than 2 between the original model, which includes all independent lead variants, and the model replacing the lead variant with a related amino acid residue, allele or haplotype indicates equivalent risk. This figure was created in BioRender. https://BioRender.com/98q9ivz. Source data are provided as a Source Data file.
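The HLA threshold in the captions is described as Bonferroni-adjusted. A Bonferroni threshold is the family-wise alpha divided by the number of tests, so the arithmetic can be sketched as below. The alpha of 0.05 is an assumption, since the captions do not state it, and the implied number of tests is therefore only indicative; the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of tests that would yield the given Bonferroni threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


ALPHA = 0.05  # assumed family-wise error rate, not stated in the captions

# The conventional genome-wide threshold corresponds to one million tests.
print(bonferroni_threshold(ALPHA, 1_000_000))

# Indicative number of tests behind the HLA threshold of 2.4e-6.
print(round(implied_number_of_tests(ALPHA, 2.4e-6)))
```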

Update of

  • Cross-ancestral GWAS identifies 29 novel variants across Head and Neck Cancer subsites.
    Ebrahimi E, Sangphukieo A, Park HA, Gaborieau V, Ferreiro-Iglesias A, Diergaarde B, Ahrens W, Alemany L, Arantes L, Betka J, Bratman SV, Canova C, Conlon M, Conway DI, Cuello M, Curado M, de Carvalho A, de Oliviera J, Gormley M, Hadji M, Hargreaves S, Healy CM, Holcatova I, Hung RJ, Kowalski LP, Lagiou P, Lagiou A, Liu G, Macfarlane GJ, Olshan AF, Perdomo S, Pinto LF, Podesta JV, Polesel J, Pring M, Rashidian H, Gama RR, Richiardi L, Robinson M, Rodriguez-Urrego PA, Santi SA, Saunders DP, Soares-Lima SC, Timpson N, Vilensky M, von Zeidler SV, Waterboer T, Zendehdel K, Znaor A, Brennan P; HEADSpAcE Consortium; McKay J, Virani S, Dudding T. Ebrahimi E, et al. medRxiv [Preprint]. 2024 Nov 18:2024.11.18.24317473. doi: 10.1101/2024.11.18.24317473. medRxiv. 2024. Update in: Nat Commun. 2025 Oct 2;16(1):8787. doi: 10.1038/s41467-025-63842-z. PMID: 39606392 Free PMC article. Updated. Preprint.
