Cell Rep. 2023 Aug 29;42(8):112965.
doi: 10.1016/j.celrep.2023.112965. Epub 2023 Aug 21.

Distinct mutational processes shape selection of MHC class I and class II mutations across primary and metastatic tumors

Michael B Mumphrey et al. Cell Rep.

Abstract

Disruption of antigen presentation via loss of major histocompatibility complex (MHC) expression is a strategy whereby cancer cells escape immune surveillance and develop resistance to immunotherapy. Here, we develop the personalized genomics algorithm Hapster and accurately call somatic mutations within the MHC genes of 10,001 primary and 2,199 metastatic tumors, creating a catalog of 1,663 non-synonymous mutations that provide key insights into MHC mutagenesis. We find that MHC class I genes are among the most frequently mutated genes in both primary and metastatic tumors, while MHC class II mutations are more restricted. Recurrent deleterious mutations are found within haplotype- and cancer-type-specific hotspots associated with distinct mutational processes. Functional classification of MHC residues reveals significant positive selection for mutations disruptive to the B2M, peptide, and T cell binding interfaces, as well as to MHC chaperones.

Keywords: CP: Cancer; CP: Immunology; HLA; MHC; cancer; evolution; genomics; immunogenetics; selection; somatic mutations.

Conflict of interest statement

Declaration of interests: An innovation disclosure related to this manuscript has been submitted to the University of Michigan Innovation Partnerships office.

Figures

Figure 1. Hapster algorithm and validation
(A) Simplified overview of the Hapster algorithm. For each gene, Hapster infers optimal reference sequences Rm, Rn from normal sequencing data, realigns to these personalized references, and calls mutations. For a complete description see supplementary note S2. (B) Germline variants identified from 69 WES samples from the 1000 Genomes project relative to either the standard reference GRCh38, or to dynamically selected references using 8 different haplotypers. A perfect reference sequence should produce 0 apparent germline variants. (C) Fraction of simulated deletions (Del), insertions (Ins), and SNVs that were either called and passed all filters, were called and filtered by either Hapster or Mutect2, or were never called. Shown here are mutations simulated at a VAF of 0.45 and a coverage of ~100x. A range of alternative coverages and VAFs are provided in supplemental figure S2A. (D) QQ plot for observed RNA-seq read support for HLA variants, assuming read support is only due to sequencing error according to a Beta-binomial model. Variants were originally identified by Hapster alone (red), or by both Hapster and Polysolver (blue) from WES data. A comparison is shown to randomly generated alternate bases (grey) which are only supported by noisy reads and follow the null model (diagonal black line). (E) Boxplot showing the number of private germline variants observed per tumor in those cases with or without somatic MHC mutations. Wilcoxon rank-sum test p-values with BH correction are shown, indicating no significant difference in germline variants between both sets of cases. (F) Comparison of non-synonymous mutation calls for the MHC class I genes between the GDAC pipeline, Polysolver, and Hapster across various cancer types from TCGA. Lightly-shaded bars represent possible false positives. 
(G) Comparison of mutational consequences for variants called by the standard GDAC pipeline, Polysolver, or Hapster in the MHC genes vs oncogenes, tumor suppressors, and neutral gene mutations from TCGA. Oncogenes (OG): KRAS, PIK3CA, IDH1, CTNNB1, FOXA1, BRAF, AKT1, EGFR. Tumor suppressors (TS): TP53, RB1, PTEN, APC, BRCA2, VHL. Neutral genes: All others.
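
The error-only null model in panel D can be illustrated with a small sketch. It computes the probability of seeing at least a given number of variant-supporting reads if all support came from sequencing error under a Beta-binomial model. The function names and the shape parameters `a` and `b` are hypothetical choices for illustration, not values taken from the paper.

```python
import math

def betabinom_pmf(k, n, a, b):
    """P(K = k) for a Beta-binomial(n, a, b) distribution, computed in log space."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_beta_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_beta_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(log_choose + log_beta_num - log_beta_den)

def error_only_pvalue(alt_reads, depth, a=1.0, b=999.0):
    """Upper-tail probability of >= alt_reads supporting reads out of depth.

    a and b are assumed (hypothetical) error-model parameters giving a
    mean per-read error rate of a / (a + b) = 0.001.
    """
    tail = sum(betabinom_pmf(k, depth, a, b) for k in range(alt_reads, depth + 1))
    return min(1.0, tail)
```

Small p-values from such a test would place a variant above the diagonal of a QQ plot, which is the pattern expected for real variants rather than noise.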
Figure 2. Compendium of MHC class I and class II mutations in primary and metastatic tumors.
(A) Distribution of all observed mutations in both primary and metastatic cancers across the coding region of the MHC genes. Denoted above are regions of secondary structure corresponding to either the binding pocket floor (β-sheets) or walls (α-helices). (B) Significant differences in the prevalence of nonsynonymous mutations and indels of individual MHC class I and MHC class II genes. *: p < 0.05; **: p < 0.01; ***: p < 0.001; ****: p < 0.0001, BH-corrected Fisher's exact test. (C) Cohort-specific mutation rates for MHC class I and MHC class II genes across all primary and metastatic cancers. Values are scaled to the number of individuals within each cohort. Colors represent the fraction of cancers with nonsynonymous/indel mutations. At neutrality, the expected nonsynonymous rate should be approximately 0.75. (D,E,F,G) Cohort summaries of coding region mutations in MHC class I (D,F) and MHC class II (E,G) genes in primary (D,E) and metastatic (F,G) cancers. Values are scaled by the number of individuals within each cohort.
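
The neutral expectation of roughly 0.75 in panel C can be reproduced with a rough calculation. The sketch below enumerates every single-base substitution in the 61 sense codons of the standard genetic code and reports the share that changes the encoded amino acid or introduces a stop. It assumes uniform codon usage and equal rates for all substitutions, which is a simplification and not necessarily how the paper derived its figure.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def neutral_nonsynonymous_fraction():
    """Fraction of single-base changes in sense codons that are nonsynonymous."""
    syn = nonsyn = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += 1
                else:
                    nonsyn += 1  # missense or stop gain
    return nonsyn / (syn + nonsyn)
```

Under these assumptions the function returns a value close to 0.75. Real mutational signatures and codon usage would shift the figure somewhat.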
Figure 3. Evidence for strong positive selection and deleteriousness of MHC somatic mutations.
(A, B) Top 30 genes showing evidence of positive selection in primary (A) or metastatic (B) cancers by CBaSE, ranked by number of cohorts with significant evidence. (C, D) Comparison of the number of cohorts significantly mutated vs pan-cancer metastatic Φpos for protein-coding genes in primary (C) or metastatic (D) cancers as measured by CBaSE. Vertical dashed lines show the cutoff for the top 0.5% of genes by Φpos. (E) Cancer cell fraction (CCF) of MHC class I variants in TCGA cohorts showing significant evidence of positive selection compared to all other cohorts. Vertical line shows 70% CCF, above which mutations are considered clonal. ***: p < 0.001, Wilcoxon rank-sum test after BH correction. (F) Proportion of functional consequences observed in various groups: "TCGA" - 2,600,654 pan-cancer mutations from TCGA, an approximately neutral model; "Unselected" - MHC class I mutations from all primary and metastatic cohorts showing no evidence of positive selection; others - MHC class I mutations from cohorts showing evidence of positive selection (n = 21-96 mutations within positively selected cohorts). "TCGA" and "Unselected" are average frequencies across cohorts, with error bars showing SEM. (G) Functional consequences of MHC class II mutations in select primary cohorts. For comparison, mutational consequence distributions of known oncogenes (OG: KRAS, PIK3CA, IDH1, CTNNB1, FOXA1, BRAF, AKT1, EGFR) and tumor suppressors (TS: TP53, RB1, PTEN, APC, BRCA2, VHL) are shown. (H) Association of HLA mutations and LOH; numbers indicate the portion of mutated tumors (mutated ∣ total). (I) Sample-level co-occurrence of mutations in either the MHC class I or APM genes within positively selected cohorts. Percentage values show the percent of mutated samples containing a hit in both the MHC class I and APM genes, with lower percentages suggesting mutual exclusivity.
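
Several panels report p-values after BH (Benjamini-Hochberg) correction. A minimal sketch of that adjustment is given here for orientation; the function name is illustrative, and this is the textbook step-up procedure rather than the authors' code.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input list."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An adjusted value below the chosen threshold (for example 0.001 for "***") would then be reported as significant.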
Figure 4. Distinct mutational processes shape cancer-type specific MHC mutational patterns
(A) Position of observed frameshift mutations within the MHC class I genes of COAD-MSI and STAD-MSI tumors. Colors show length of homopolymer microsatellites at each observed frameshift. (B) Total number of 1-bp frameshifts observed per tumor at homopolymers of varying length in MSI and MSS tumors. (C) Frequency of 1-bp frameshifts observed at homopolymers of varying length in MSI and MSS tumors. (D) Position of observed stop gain mutations within the MHC class I genes of lymphomas and squamous cell carcinomas. Colors show motif of mutated position. (E) Canonical motif for AID. (F) DNA motifs for stop gain mutations observed in lymphoma cohorts. Mutated base marked with red box. (G) Canonical motifs for APOBEC proteins. (H) DNA motifs for stop gain mutations observed in squamous cell carcinoma cohorts. Mutated base marked with red box. (I) Percentage of mutations within squamous cell carcinomas with (SG) or without (No SG) observed stop gains that can be attributed to signatures SBS2 and SBS13, which have been associated with APOBEC activity. **: p < 0.01, BH-corrected t-tests.
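
Panels A to C group frameshifts by the length of the homopolymer in which they occur. One simple way to measure that length at a given position is sketched below; the function name and the 0-based coordinate convention are assumptions for illustration.

```python
def homopolymer_length(seq, pos):
    """Length of the run of identical bases containing 0-based index pos."""
    base = seq[pos]
    start = pos
    # Extend the run to the left while the base is unchanged.
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    # Extend the run to the right while the base is unchanged.
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1
```

For example, `homopolymer_length("ACCCCCCCG", 3)` gives 7, the length of the C run that a 1-bp frameshift at that position would fall in.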
Figure 5. MHC class I missense mutations are enriched within specific protein functional domains
(A) Schematic representation of the constructed null models and their comparisons (arrows, colors match comparisons in panel C). (B) Schematic overview of HLA proteins showing secondary structure, exon boundaries, and amino acid interactions with various binding partners. (C) Distribution of observed vs simulated dN/dS ratio fold-change for amino acids that are predicted to interact with various MHC interacting partners. Observed dN/dS ratios are compared to dN/dS ratios from simulations taking into account the mutational signature activities within each of the cohorts showing evidence of positive selection. Rug plots show individual data points. Filled regions show tails of empirical null distribution. (light-red: top and bottom 5%; dark-red: top and bottom 1%) (D-E) dN/dS ratio fold-changes vs simulated / theoretical (D) and observed / empirical (E) null models for individual annotation regions within cohorts showing evidence of positive selection (P: primary cancers; M: Metastatic/refractory cancers). Stars denote observations above the 95th percentile based on the observed null distribution. (F) Structure of the MHC/B2M complex showing 3D clustering of recurrently mutated amino acids. Positions within the MHC protein are colored based on the number of observed mutations (0: green, 1: yellow, 2: orange, 3: pink, 4: magenta, 5: red). Positions mutated 3 or more times are shown with side chains visible. B2M and bound peptide in grey. (G) Change in binding energy due to mutations in amino acids at the HLA:B2M interface as predicted by the software SSIPe.

