Science. 2024 Feb 23;383(6685):eadi3808.
doi: 10.1126/science.adi3808. Epub 2024 Feb 23.

An immunogenetic basis for lung cancer risk

Chirag Krishna et al. Science.

Abstract

Cancer risk is influenced by inherited mutations, DNA replication errors, and environmental factors. However, the influence of genetic variation in immunosurveillance on cancer risk is not well understood. Leveraging population-level data from the UK Biobank and FinnGen, we show that heterozygosity at the human leukocyte antigen (HLA)-II loci is associated with reduced lung cancer risk in smokers. Fine-mapping implicated amino acid heterozygosity in the HLA-II peptide binding groove in reduced lung cancer risk, and single-cell analyses showed that smoking drives enrichment of proinflammatory lung macrophages and HLA-II+ epithelial cells. In lung cancer, widespread loss of HLA-II heterozygosity (LOH) favored loss of alleles with larger neopeptide repertoires. Thus, our findings nominate genetic variation in immunosurveillance as a critical risk factor for lung cancer.

Conflict of interest statement

Competing interests:

D.C. and R.M.S. have filed a patent application related to tumor mutational load (17536715). D.C., C.K., and T.L. have filed a patent application related to HLA class I sequence divergence and cancer therapy (17770259). M.M. serves on the scientific advisory board of and holds stock in Compugen, Myeloid Therapeutics, Morphic Therapeutics, Asher Bio, Dren Bio, Nirogy, Oncoresponse, Owkin, Pionyr, OSE and Larkspur. M.M. serves on the scientific advisory board of Innate Pharma, DBV, and Genenta. All other authors declare no competing interests.

Figures

Fig. 1. HLA genotype and associations with lung cancer risk in UK Biobank and FinnGen.
(A) Correlation of HLA allele frequencies in the UK Biobank with mean allele frequencies across England, Scotland, and Wales obtained from the Allele Frequency Net Database (AFND). P-value calculated using Spearman correlation. (B) Correlation of HLA allele frequencies in FinnGen with allele frequencies from Finland obtained from AFND. P-value calculated using Spearman correlation. (C) Correlation of HLA allele frequencies in UK Biobank with allele frequencies in FinnGen. P-value calculated using Spearman correlation. (D) Rates of heterozygosity at 4-digit allele resolution in UK Biobank. (E) Rates of heterozygosity at 4-digit allele resolution in FinnGen. HLA-DPA1 genotypes were not imputed in FinnGen and are thus left gray. (F) Distribution of age at onset among lung cancer cases compared to age at first assessment in UK Biobank. (G) Distribution of age at onset among lung cancer cases compared to age at first assessment in FinnGen. (H) Multivariable logistic regression analyses testing heterozygosity at the indicated locus together with all clinical and demographic covariates for associations with lung cancer case/control status in UK Biobank. Forest plots depict odds ratios from logistic regression and 95% confidence intervals. (I) Multivariable logistic regression analyses testing heterozygosity at the indicated locus together with all clinical and demographic covariates for associations with lung cancer case/control status in FinnGen. Forest plots depict odds ratios from logistic regression and 95% confidence intervals.
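The forest plots in (H) and (I) report odds ratios with 95% confidence intervals from logistic regression. As a minimal sketch of how such a pair is commonly derived from a fitted coefficient, the Python below exponentiates a log-odds estimate and a Wald interval around it. The function name, the Wald approximation, and the example numbers are assumptions for illustration and are not taken from the paper.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient (log-odds) and its
    standard error into an odds ratio with a Wald confidence interval.

    z = 1.96 corresponds to an approximate 95% interval (assumption).
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper

# Hypothetical coefficient for a heterozygosity indicator (not from the paper).
or_, lower, upper = odds_ratio_with_ci(beta=-0.10, se=0.04)
print(f"OR = {or_:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An odds ratio below 1 with an interval that excludes 1 would correspond to a locus at which heterozygosity is associated with lower odds of being a lung cancer case.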
Fig. 2. Maximal HLA-II heterozygosity is associated with reduced lung cancer incidence among smokers in UK Biobank and FinnGen.
(A) Effect of smoking status (current/former/never) on lung cancer incidence in UK Biobank. (B) Effect of smoking status (current/former/never) on lung cancer incidence in FinnGen. (C) Association of maximal HLA-II heterozygosity (10 unique alleles at HLA-DRB1, DQB1, DQA1, DPB1, DPA1) with reduced lung cancer incidence among former smokers in UK Biobank. Heterozygous individuals are denoted by dotted lines; solid lines denote homozygous individuals. (D) Association of maximal HLA-II heterozygosity (8 unique alleles at HLA-DRB1, DQB1, DQA1, DPB1 as DPA1 genotypes were unavailable in FinnGen) with reduced lung cancer incidence among former smokers in FinnGen. Heterozygous individuals are denoted by dotted lines; solid lines denote homozygous individuals. Plots with 95% confidence intervals shown in fig. S9. All P-values were calculated via multivariable Cox regression.
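The definition of maximal HLA-II heterozygosity used in (C) and (D) can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names, the genotype layout (a pair of 4-digit allele strings per locus), and the example genotype are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Loci named in the caption. DPA1 genotypes were unavailable in FinnGen,
# so maximal heterozygosity there is defined over the first four loci.
UKB_LOCI = ("DRB1", "DQB1", "DQA1", "DPB1", "DPA1")
FINNGEN_LOCI = ("DRB1", "DQB1", "DQA1", "DPB1")

Genotype = Dict[str, Tuple[str, str]]

def count_unique_alleles(genotype: Genotype, loci: Iterable[str]) -> int:
    """Number of distinct alleles carried across the listed loci."""
    return sum(len(set(genotype[locus])) for locus in loci)

def is_maximally_heterozygous(genotype: Genotype, loci: Iterable[str]) -> bool:
    """True when the two alleles differ at every listed locus."""
    return all(genotype[locus][0] != genotype[locus][1] for locus in loci)

# Hypothetical individual: heterozygous everywhere except HLA-DPB1.
example = {
    "DRB1": ("DRB1*15:01", "DRB1*03:01"),
    "DQB1": ("DQB1*06:02", "DQB1*02:01"),
    "DQA1": ("DQA1*01:02", "DQA1*05:01"),
    "DPB1": ("DPB1*04:01", "DPB1*04:01"),
    "DPA1": ("DPA1*01:03", "DPA1*02:01"),
}

print(count_unique_alleles(example, UKB_LOCI))       # 9 of a possible 10
print(is_maximally_heterozygous(example, UKB_LOCI))  # False
```

Under this reading, an individual counts as maximally heterozygous only with 10 unique alleles in UK Biobank or 8 in FinnGen; a single homozygous locus, as in the example, is enough to fall short.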
Fig. 3. Heterozygosity at individual HLA-II loci is associated with reduced lung cancer incidence among smokers in UK Biobank and FinnGen.
(A to E) Association of heterozygosity at the indicated HLA-II locus with reduced lung cancer incidence among current and former smokers in UK Biobank. Dotted lines denote heterozygous individuals; solid lines represent homozygous individuals. (F to I) Association of heterozygosity at the indicated HLA-II locus with reduced lung cancer incidence among smokers in FinnGen. Dotted lines denote heterozygous individuals; solid lines represent homozygous individuals. Plots with 95% confidence intervals shown in fig. S9. All P-values were calculated via multivariable Cox regression.
Fig. 4. Heterozygosity fine-mapping and structural analyses of HLA-II peptide binding groove amino acid sequences.
(A and B) Associations between heterozygosity at the indicated position of the peptide binding groove of HLA-DRB1 (A) and HLA-DQB1 (B), respectively, and lung cancer risk using a multivariable logistic regression in UK Biobank adjusting for smoking status. The dotted line indicates FDR P = 0.05. Annotation bars indicate polymorphism at the indicated position defined by sequence entropy and distance from peptide based on analysis of representative peptide-MHC crystal structures. (C) Structural visualization of significant amino acid positions from (A) and positions significant after stepwise regression on a representative HLA-DRB1 crystal structure in complex with bound peptide. (D) Structural visualization of significant amino acid positions from (B) and positions significant after stepwise regression on a representative HLA-DQB1 crystal structure in complex with bound peptide.
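Two quantities in this caption lend themselves to a small worked example: per-position amino acid heterozygosity and sequence entropy as a measure of polymorphism. The Python sketch below assumes Shannon entropy in bits and pre-aligned, equal-length sequences; the caption does not specify either, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter

def position_entropy(residues):
    """Shannon entropy (bits) of the residues observed at one aligned
    position across a set of alleles. 0.0 means the position is invariant."""
    counts = Counter(residues)
    total = len(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def heterozygous_positions(seq_a, seq_b):
    """1-based positions at which an individual's two aligned allele
    sequences carry different amino acids."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# Toy aligned fragments, not real HLA sequences.
print(heterozygous_positions("GDTRP", "GDARP"))  # [3]
print(position_entropy(["T", "T", "A", "A"]))    # 1.0
print(position_entropy(["G", "G", "G", "G"]))    # 0.0
```

A per-position heterozygosity indicator of this kind is the sort of variable that could be entered into the regression described in (A) and (B), with entropy summarizing how variable each position is across alleles.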
Fig. 5. Tobacco smoking-induced inflammatory programs identified via single-cell RNA-sequencing analysis of the normal lung from three independent cohorts.
(A) UMAP of normal lung scRNA-seq data from Leader et al. Broad compartments containing multiple clusters are labeled. (B) UMAP of cells from smokers only from Leader et al. (C) UMAP of cells from never-smokers only from Leader et al. (D) Increased prevalence of the C25 alveolar macrophage cluster in smokers compared to never-smokers. Boxplots depict minimum, first quartile, median, third quartile, maximum, and outliers. (E) Upregulation of HLA-II genes in C25 compared to other macrophage clusters from Leader et al. (F) Differential expression analysis comparing smoker C25 cells to never-smoker C25 cells. (G) Pathway analysis using differentially expressed genes from (F) as input. (H) Enrichment of macrophages in a smoker compared to two never-smokers in an independent scRNA-seq dataset from Travaglini et al. (I) Expression of HLA-II genes in antigen-presenting cells (B cells and macrophages) and epithelial cells from an independent scRNA-seq dataset from Kim et al. containing both tumor and normal lung data. (J) Upregulation of HLA-DRB1 expression across immune and epithelial cells in smokers compared to never-smokers from Kim et al.
Fig. 6. HLA-I and HLA-II loss of heterozygosity and immunopeptidome dynamics in lung cancer.
(A) Rates of loss of heterozygosity (LOH) at HLA-I and HLA-II across multiple independent large lung cancer cohorts. HLA LOH at all 8 HLA loci in TCGA was calculated using LOHHLA. The proportion of individuals with loss at any HLA-I locus (any one or more of HLA-A/B/C) or any HLA-II locus (any one or more of HLA-DRB1/DQB1/DQA1/DPB1/DPA1) was determined for LUAD (HLA-I: N = 458, HLA-II: N = 465), LUSC (HLA-I: N = 416, HLA-II: N = 381), and the full NSCLC cohort (HLA-I: N = 874, HLA-II: N = 846), and displayed as the mean across six LOHHLA coverage filters (5 to 30 in increments of 5). For individuals evaluated at ≥1 HLA-I locus and ≥1 HLA-II locus, LOH at only HLA-I was defined as LOH at one or more HLA-I loci but no HLA-II loci, and vice versa for HLA-II only LOH (LUAD: N = 437, LUSC: N = 347). For PCAWG and Hartwig, HLA-I and HLA-II LOH were determined using the Hartwig Medical Foundation analytical pipeline (33). Loss at any HLA-I locus and any HLA-II locus was calculated similarly for the full NSCLC cohort (TCGA: N = 784, Hartwig: N = 657, PCAWG: N = 83). A subset of samples in Hartwig and PCAWG were specifically annotated by histology (LUAD or LUSC); for these samples, rates within each histology were also calculated (Hartwig LUAD: N = 273, Hartwig LUSC: N = 35, PCAWG LUAD: N = 36, PCAWG LUSC: N = 47). All other samples in Hartwig and PCAWG are labeled in the original metadata as NSCLC and are presented in the rightmost panel, NSCLC (LUAD+LUSC), which includes samples with and without histology annotation. Dynamics of the predicted neopeptide repertoire in TCGA LUAD (B) and TCGA LUSC (C) in tumors with and without HLA-II LOH. The neopeptide repertoires of heterozygous patients unaffected by LOH are indicated by the red boxes. The neopeptide repertoires of patients with LOH at the specified locus before and after accounting for peptide loss due to the LOH event are indicated by the green and blue boxes, respectively. Homozygous patients without LOH are shown by the purple boxes. Boxplots in (B) and (C) depict minimum, first quartile, median, third quartile, maximum, and outliers. Numbers above boxplots in (B) and (C) indicate P-values computed with a two-sided Wilcoxon test.
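The comparison in (B) and (C) between repertoires before and after accounting for peptide loss can be pictured with a small set-based sketch in Python. It assumes that a patient's predicted neopeptide repertoire at a locus is the union of the peptides predicted to bind each allele, and that LOH removes the contribution of the lost allele. That reading, the function name, and the toy allele and peptide labels are all assumptions and do not reproduce the authors' prediction pipeline.

```python
def repertoire_sizes(peptides_by_allele, lost_allele):
    """Return (size before LOH, size after LOH) of the predicted
    neopeptide repertoire at one locus.

    peptides_by_allele maps each allele to the set of neopeptides
    predicted to bind it; lost_allele is the allele removed by LOH.
    """
    before = set().union(*peptides_by_allele.values())
    retained = [
        peptides
        for allele, peptides in peptides_by_allele.items()
        if allele != lost_allele
    ]
    after = set().union(*retained)
    return len(before), len(after)

# Hypothetical heterozygous patient with partly overlapping repertoires.
patient = {
    "allele_1": {"pep_a", "pep_b", "pep_c", "pep_d"},
    "allele_2": {"pep_c", "pep_e"},
}

print(repertoire_sizes(patient, lost_allele="allele_1"))  # (5, 2)
print(repertoire_sizes(patient, lost_allele="allele_2"))  # (5, 4)
```

In this toy case, losing the allele with the larger repertoire shrinks the presented set from 5 peptides to 2, whereas losing the other allele leaves 4.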

References

    1. Herbst RS, Heymach JV, Lippman SM, Lung cancer. N. Engl. J. Med. 359, 1367–1380 (2008).
    2. Siegel RL, Miller KD, Fuchs HE, Jemal A, Cancer statistics, 2022. CA Cancer J. Clin. 72, 7–33 (2022).
    3. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F, Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 71, 209–249 (2021).
    4. Yoshida K, Gowers KHC, Lee-Six H, Chandrasekharan DP, Coorens T, Maughan EF, Beal K, Menzies A, Millar FR, Anderson E, Clarke SE, Pennycuick A, Thakrar RM, Butler CR, Kakiuchi N, Hirano T, Hynds RE, Stratton MR, Martincorena I, Janes SM, Campbell PJ, Tobacco smoking and somatic mutations in human bronchial epithelium. Nature 578, 266–272 (2020).
    5. Doll R, Hill AB, Smoking and carcinoma of the lung; preliminary report. Br. Med. J. 2, 739–748 (1950).
