eLife. 2014 Aug 1;3:e02725.
doi: 10.7554/eLife.02725.

Mismatch repair deficiency endows tumors with a unique mutation signature and sensitivity to DNA double-strand breaks

Hui Zhao et al. eLife.

Abstract

DNA replication errors that persist as mismatch mutations make up the molecular fingerprint of mismatch repair (MMR)-deficient tumors and confer on them resistance to standard therapy. Using whole-genome and whole-exome sequencing, we here confirm an MMR-deficient mutation signature that is distinct from other tumor genomes, but surprisingly similar to germ-line DNA, indicating that a substantial fraction of human genetic variation arises through mutations escaping MMR. Moreover, we identify a large set of recurrent indels that may serve to detect microsatellite instability (MSI). Indeed, using endometrial tumors with immunohistochemically proven MMR deficiency, we optimize a novel marker set capable of detecting MSI and show it to have greater specificity and selectivity than standard MSI tests. Additionally, we show that recurrent indels are enriched for the 'DNA double-strand break repair by homologous recombination' pathway. Consequently, DSB repair is reduced in MMR-deficient tumors, triggering a dose-dependent sensitivity of MMR-deficient tumor cultures to DSB inducers.

Keywords: DNA double-strand breaks; DSB inducers; MSI; mismatch repair deficiency; mutation pattern; whole-genome sequencing.

Conflict of interest statement

DL is an inventor on a patent application regarding the use of recurrent indels to detect MSI. VIB is the owner of this patent application, which has been licensed to an outside company. Neither VIB nor any of the authors have equity stakes in the company. However, VIB stands to eventually receive royalties.

The other authors declare that no competing interests exist.

Figures

Figure 1. Somatic mutations in MMR-deficient tumors.
(A) The average frequency of mutations, indels, and substitutions in MMR-deficient tumors vs MMR-proficient tumors, expressed as number of mutations per base (mpb). (B) The fraction of indels and substitutions observed in microsatellites, homopolymers (length over 5 bp), short homopolymers (length of 3–5 bp), and ‘not in repeat regions’ compared to their expected fraction in these regions. (C and D) Frequencies of substitutions (C) and indels (D) in MMR-deficient tumors stratified into exonic, intergenic, and intronic regions. (E) Indel frequencies corrected for homopolymer number, length, and base composition. Indel frequencies in MMR-deficient tumors represent estimates only, as orthogonal technologies revealed false-positive rates of 12%, while false-negative rates in CG and Illumina whole-genomes were estimated to be 27.7% and 0.5%, respectively, by Zook et al. (2014). In MMR-proficient tumors all detected somatic indels were independently validated. DOI: http://dx.doi.org/10.7554/eLife.02725.004
Figure 1—figure supplement 1. The fraction of indels (left panel) and substitutions (right panel) observed in microsatellites, homopolymers, short homopolymers and in nonrepeat regions compared to their expected fraction in these regions.
Data are shown for the individual MMR-deficient tumors. In all three tumors, substitutions predominantly affected non-repeat regions, while indels were mainly confined to homopolymers. DOI: http://dx.doi.org/10.7554/eLife.02725.008
Figure 1—figure supplement 2. The relative indel frequency defined as the number of indels divided by the total bases of non-homopolymer regions in MMR-deficient tumors stratified into intergenic, exonic, 5′UTR, 3′UTR, and intronic regions is shown.
Indel frequencies in homopolymers are shown in the left panel, whereas indel frequencies in non-homopolymer regions are shown in the right panel. The algorithm we used to correct for homopolymer content, composition, and length can be found in the ‘Materials and methods’ section under the header ‘Evidence of negative clonal selection’. In homopolymer regions, there was a 16% decrease in indel frequency in exonic regions. In non-homopolymer regions, a clear decrease was also observed for exonic regions, confirming that the decrease in exonic indels is not only due to differences in homopolymer characteristics between exonic regions and the rest of the genome. This reveals apparent negative selection in exonic regions, independent of homopolymer content, composition, or length. DOI: http://dx.doi.org/10.7554/eLife.02725.009
Figure 1—figure supplement 3. Copy number status of the 5 whole-genomes assessed by Illumina Human-Omni1 and CytoSNP-12 chips.
DOI: http://dx.doi.org/10.7554/eLife.02725.010
Figure 2. Somatic substitution patterns in MMR-deficient tumors.
(A) Somatic substitution patterns in whole-genome sequences of MMR-deficient endometrial tumors (MMR−), matched germ-line (peripheral white blood cell) DNA from MMR-deficient tumors (MMR-germ-line), de novo mutations as identified in parent-offspring trios (de novo), 1000 Genomes Project (1 KG), the human–chimpanzee divergence panel (Divergence), melanoma and small-cell lung cancer (SCLC), BRCA-deficient breast tumors (BRCA−), MMR-proficient endometrial tumors (MMR+). (B) Somatic substitution frequency per million dinucleotides and per million substitutions. The first row lists the base following the mutated base, the second row lists the base that was mutated, and the third row lists the new base. Gray boxes indicate transitions. Frequencies are depicted color-coded following a logarithmic distribution as shown by the gradient on the left. (C and D) Squared coefficients of correlation (R²) between dinucleotide substitution patterns (C) and between the number of intergenic substitutions per 1 Mb window (D). Substitutions in MMR-proficient and de novo data sets were too sparse for correlations at a 1 Mb scale. (E) Multivariate linear regression modeling of genomic features predicting substitution frequencies per 1 Mb window in MMR-deficient tumors, and the outcome of the same multivariate linear regression modeling in the germ-line genetic variability panels. T-values resulting from the linear model are displayed as bar plots and indicate direction and significance of correlation (shaded grey box equals p > 0.05, Bonferroni-corrected per model). The de novo substitution frequency was too low to be modeled at this resolution. (F) Frequency of transitions (excluding G:C>A:T in CG) and transversions per 1 Mb window, binned per replication time. Frequencies are displayed relative to the earliest replicating bin. Linear regression analysis was performed to assess whether observed increases were significant and independent of other genomic features.
All Bonferroni-corrected p-values were significant (p < 2.0E−5) except for transversions in MMR-deficient tumors, which were not significant (NS; p = 0.23). (G) Effect of homopolymer nucleotide composition (An, Tn, Cn, or Gn) on substitutions immediately flanking a homopolymer. For example, the nucleotide B next to the poly-A repeat 'NNB(A)nBNN' is mostly converted to an A (NNB(A)nANN) and not to a C, G, or T. The modest increase in A substitutions next to Cn homopolymers and T substitutions near Gn homopolymers is caused by C:G>T:A transitions in a CpG context. (H) Substitution frequency in and outside CpG islands, relative to genome-wide substitution frequencies. Data combined for all three MMR-deficient genomes are represented for (B, E–H), but individual MMR-deficient genomes display similar patterns (Figure 2—figure supplements 1–5). DOI: http://dx.doi.org/10.7554/eLife.02725.011
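The squared correlation coefficients in panels C and D can be illustrated with a minimal sketch. It assumes that each substitution pattern (or each set of per-window counts) has already been reduced to a plain list of numbers in a fixed order; the function name `r_squared` and the toy vectors are hypothetical and are not taken from the paper.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long numeric vectors.

    x and y are assumed to hold matched values, e.g. the frequency of each
    dinucleotide substitution class in two data sets (illustrative only).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Toy example: two perfectly proportional patterns.
pattern_a = [1.0, 2.0, 3.0, 4.0]
pattern_b = [2.0, 4.0, 6.0, 8.0]
print(r_squared(pattern_a, pattern_b))
```

Because the coefficient is squared, a strongly negative and a strongly positive correlation give the same value, so the sign of the relationship has to be read from the data separately.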
Figure 2—figure supplement 1. Somatic substitution frequency per million dinucleotides and per million substitutions for the individual MMR-deficient genomes.
The first row lists the base following the mutated base, the second row the base that was mutated, and the third row the new base. Transitions are indicated by grey boxes. Frequencies are depicted color-coded following a logarithmic distribution as shown by the gradient on the right. The average R² between the MMR-deficient tumors is 0.75. DOI: http://dx.doi.org/10.7554/eLife.02725.012
Figure 2—figure supplement 2. Multivariate linear regression modeling of genome features predicting substitution frequencies per 1 Mb window in the individual MMR-deficient genomes.
T-values resulting from the linear model are displayed for each genome feature in the bar plots and indicate significance (shaded grey box equals p > 0.05, Bonferroni-corrected per model) and direction of the correlation. High concordance between the individual tumors is observed. DOI: http://dx.doi.org/10.7554/eLife.02725.013
Figure 2—figure supplement 3. Frequency of transitions (excluding G:C>A:T in CG) and transversions per 1 Mb window, binned per replication time, relative to the earliest replicating bin.
Mutations are divided into 7 bins (bins from left to right represent early to late replication timing). Linear regression analysis was performed to assess whether observed increases were significant and independent of other genomic features. Bonferroni-corrected p-values were significant (p < 2.0E−5) for transitions and nonsignificant for transversions. Transversions were not significantly correlated with replication timing in any of the individual MMR-deficient genomes, whereas transitions were correlated in each of them. DOI: http://dx.doi.org/10.7554/eLife.02725.014
Figure 2—figure supplement 4. Effect of homopolymer nucleotide composition (An, Tn, Cn, or Gn) on substitutions immediately flanking a homopolymer in the individual MMR-deficient genomes.
The slight increase in A substitutions next to Cn homopolymers and T substitutions near Gn homopolymers is exclusively caused by C:G>T:A transitions in a CpG context, indicating they are likely deaminations of methylated cytosines. DOI: http://dx.doi.org/10.7554/eLife.02725.015
Figure 2—figure supplement 5. Frequency of transitions and transversions in and outside of CpG islands in the individual MMR-deficient genomes.
The frequency of transitions and transversions inside and outside CpG islands was determined as the number of mutations divided by the total size of each of the features, and expressed relative to the general, genome-wide frequencies of transitions and transversions. Individual genomes display similar patterns. DOI: http://dx.doi.org/10.7554/eLife.02725.016
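The relative frequency described in this caption amounts to a ratio of two densities. The following is a minimal sketch under that reading; the function name, argument names, and the numbers in the example are hypothetical and do not come from the paper.

```python
def relative_frequency(n_in_feature, feature_bp, n_genome, genome_bp):
    """Mutation density inside a feature, relative to the genome-wide density.

    n_in_feature / feature_bp is the per-base frequency within the feature
    (e.g. all CpG islands combined); n_genome / genome_bp is the per-base
    frequency across the whole genome. A value below 1 indicates depletion
    in the feature, a value above 1 indicates enrichment.
    """
    if feature_bp <= 0 or genome_bp <= 0 or n_genome <= 0:
        raise ValueError("sizes and the genome-wide count must be positive")
    feature_density = n_in_feature / feature_bp
    genome_density = n_genome / genome_bp
    return feature_density / genome_density


# Made-up numbers: 50 transitions in 1 Mb of CpG islands versus
# 1000 transitions in a 10 Mb genome give half the genome-wide density.
print(relative_frequency(50, 1_000_000, 1000, 10_000_000))
```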
Figure 3. Somatic indel patterns in MMR-deficient tumors.
(A) Impact of genomic features in MMR-deficient tumors on indel frequency as assessed by multivariate linear regression modeling. T-values resulting from the linear model are displayed for each genomic feature in the bar plots and indicate significance (shaded grey box equals p > 0.05, Bonferroni-corrected per model) and direction of the correlation. (B) Fraction of all indels inserting or deleting the indicated number of bases. (C) Fraction of homopolymers affected by an indel stratified per nucleotide, compared to the genome-wide fraction of homopolymers with that nucleotide content. DOI: http://dx.doi.org/10.7554/eLife.02725.017
Figure 3—figure supplement 1. The distance between a somatic substitution and the nearest somatic indel (top left), substitution (top right), repeat (bottom left), or homopolymer (bottom right) in the individual MMR-deficient genomes, and the expected distance based on 200 random models.
Substitutions near indels and near other substitutions were enriched within ranges of ∼30 bp and ∼200 bp, respectively, whereas substitutions near repeats were enriched only at the base immediately flanking the repeat. DOI: http://dx.doi.org/10.7554/eLife.02725.018
Figure 4. Recurrent somatic indels.
(A) The average mutation frequencies in the exons of 13 MMR-deficient tumors and four MMR-proficient tumors. No obvious difference was observed between MLH1, MSH2, and MSH6 deficiency in terms of the mutation frequencies, substitution patterns, and indel compositions (Figure 4—figure supplement 5). (B) Fraction of homopolymers affected by an indel as a function of the homopolymer length stratified for exons, 5′ and 3′UTRs. (C) The fraction of homopolymers in exons, 5′ and 3′UTRs that are affected by an indel as a function of the homopolymer length. (D) Average somatic indel frequencies in exons, 5′ and 3′UTRs of 16 MMR-deficient tumors. (E) The enrichment of observed over expected frequencies of recurrent indels. Enrichments were stratified by length of the affected homopolymer and calculated for recurrent indels in 2, 3, 4, and 5 or more out of 16 MMR-deficient tumors. DOI: http://dx.doi.org/10.7554/eLife.02725.019
Figure 4—figure supplement 1. Clustering analysis of 13 MMR-deficient exomes for the genes affected by either a somatic substitution or indel in the coding regions.
No obvious subgroups in terms of cancer of origin or between primary tumors and cell cultures were observed. DOI: http://dx.doi.org/10.7554/eLife.02725.023
Figure 4—figure supplement 2. The fraction of indels (left panel) and substitutions (right panel) identified by whole-exome sequencing, as observed in microsatellites, homopolymers (length over 5 bp), short homopolymers (length of 3–5 bp) and ‘not in repeat regions’ compared to their expected fraction in these regions.
Indels mainly affected homopolymers (59.0%), whereas microsatellites and short homopolymers were affected at a frequency that was expected based on their genome-wide occurrence. In contrast, indels were depleted in non-repeat regions. Substitutions affected the exome independent of repeat composition. These distributions mirror our observations in the MMR-deficient tumors undergoing whole-genome sequencing. DOI: http://dx.doi.org/10.7554/eLife.02725.024
Figure 4—figure supplement 3. Characteristics of the exonic homopolymers recurrently affected.
For the 477 homopolymers affected in at least 2 out of 16 tumors, respectively 176, 135, 85, and 81 consisted of A, T, G, or C stretches. Out of the 34 homopolymers affected in at least 6 out of 16 tumors, 15, 15, 1, and 3 consisted of A, T, G, or C stretches, respectively. The length of recurrently affected homopolymers (in at least 2 out of 16 tumors) varied from 6 nucleotides to 25 nucleotides, but recurrence was biased towards homopolymers with length 7–9 nucleotides. DOI: http://dx.doi.org/10.7554/eLife.02725.025
Figure 4—figure supplement 4. The observed and expected frequencies of indels recurrently affected in homopolymers (in at least 2 out of 16 tumors) stratified for homopolymer length and for those affecting coding exonic regions and the 3′UTR.
The difference between observed and expected recurrent indels is high for short homopolymers, but non-existent for long homopolymers. DOI: http://dx.doi.org/10.7554/eLife.02725.026
Figure 4—figure supplement 5. Mutation patterns obtained from MLH1-deficient, MSH2-deficient, and MSH6-deficient exomes.
(A–C) Mutation frequencies. (D) Somatic substitution patterns. (E–G) Indel compositions. No obvious difference is observed. DOI: http://dx.doi.org/10.7554/eLife.02725.027
Figure 5. The 56-marker panel for MSI testing.
(A) Receiver–operator curve assessing the impact of the number of positive homopolymer markers (out of 59) on the sensitivity and specificity of MSI testing, based on a panel of 236 EM tumors immunohistochemically characterized for their MMR status. (B) The Matthews correlation coefficient (MCC) of the ROC curve was calculated for each threshold, and a threshold of 3 resulted in the highest MCC-value (MCC = 0.97). (C and D) The extended Bethesda panel and the 59-marker panel were compared in an independent series of 114 unselected primary endometrial tumors (C) and 126 stage II or III CRC tumors (D). Results were color-coded according to high microsatellite instability (MSI-H; more than 1 marker positive), low microsatellite instability (MSI-L; 1 marker positive), or microsatellite stable status (MSS; 0 markers positive) as determined with the extended Bethesda panel. For endometrial tumors, 71 tumors (62%) were defined as MSS/MSI-L and 43 tumors (38%) as MSI-H by the 59-marker panel. Out of these 43 MSI-H tumors, Bethesda identified 32 tumors as MSI-H (>2 markers positive), 7 tumors as MSI-L, and 5 tumors as MSS. Vice versa, Bethesda did not identify any MSI-H tumor that was not identified by our panel. For colorectal tumors, there were 97 MSS tumors in our 59-marker panel that were concordantly called MSS or MSI-L by the Bethesda panel. The remaining 29 samples were detected as MSI in the 59-marker panel. 28 of these were also called MSI-H by the Bethesda panel, whereas one was called MSS by the Bethesda panel. DOI: http://dx.doi.org/10.7554/eLife.02725.028
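The threshold selection in panel B can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a tumor is called MSI when its number of positive markers is at least the threshold, and the function names and toy data are hypothetical.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def best_threshold(positive_marker_counts, is_mmr_deficient, max_markers):
    """Scan thresholds 1..max_markers and return (threshold, MCC) with the highest MCC.

    Assumption: a tumor is called MSI when its count of positive markers
    is greater than or equal to the threshold. Ties keep the lowest threshold.
    """
    best = (None, -1.0)
    for threshold in range(1, max_markers + 1):
        tp = fp = tn = fn = 0
        for count, truth in zip(positive_marker_counts, is_mmr_deficient):
            called = count >= threshold
            if called and truth:
                tp += 1
            elif called and not truth:
                fp += 1
            elif not called and truth:
                fn += 1
            else:
                tn += 1
        score = mcc(tp, fp, tn, fn)
        if score > best[1]:
            best = (threshold, score)
    return best


# Toy data (not from the paper): positive-marker counts per tumor and the
# MMR status from immunohistochemistry (True = MMR-deficient).
counts = [0, 1, 0, 2, 5, 9, 12, 3]
status = [False, False, False, False, True, True, True, True]
print(best_threshold(counts, status, max_markers=59))
```

MCC uses all four cells of the confusion matrix, which makes it a reasonable single summary when the MSI and MSS groups differ in size.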
Figure 6. Reduced DSBR by HR activity in MMR-deficient cells.
(A) Representative confocal images of MMR-deficient and MMR-proficient primary tumor cells exposed for 24 hr to vehicle, 26 μM olaparib, or 300 nM mitomycin C stained for the homologous repair marker RAD51 (green), the DNA damage marker γH2AX (red), and counterstained with DAPI (blue). The bar is 10 µm wide. (B) Quantification of cells containing >5 RAD51 or γH2AX foci. Averages are shown for MMR-deficient and MMR-proficient primary tumor cultures after 24 hr of treatment with vehicle, 26 μM olaparib or 300 nM mitomycin C. DOI: http://dx.doi.org/10.7554/eLife.02725.033
Figure 6—figure supplement 1. Cell cycle distribution in untreated MMR-deficient and MMR-proficient cell cultures.
No difference was observed in G1, S, or G2/M phase frequency between 7 MMR-deficient and 4 MMR-proficient cultures (p = 0.45, 0.30 or 0.94). DOI: http://dx.doi.org/10.7554/eLife.02725.034
Figure 6—figure supplement 2. MMR-deficient tumor cultures were challenged with olaparib (26 μM), camptothecin (30 nM), or mitomycin C (300 nM) for 24 hr, pulsed with BrdU for 2 hr and analyzed for cell cycle by propidium iodide staining (DNA content analysis) using flow cytometry.
The bar plot shows the fraction of unlabeled (arrested) cells in S and G2/M, normalized to the G1 fraction; bars indicate SEM; data represent the results from 7 cultures. All experiments were repeated twice. DNA damage provoked by exposure to camptothecin consistently increased stalled (BrdU-negative) cells in S phase (average 13-fold increase; p = 5.23E−5). Mitomycin C caused an increase of stalled cells in S phase (3.08-fold; p = 5.8E−3) and in G2/M phase (3.12-fold; p = 2.2E−7). Olaparib induced, as expected, an increase in stalled cells in S and G2/M (respectively, a 3.35 and a 2.54-fold increase; p = 2.1E−3 and 5.2E−4). Overall, this indicates that MMR-deficient cultures did not exhibit any loss of G2/M cell cycle checkpoints or DNA damage signaling. DOI: http://dx.doi.org/10.7554/eLife.02725.035
Figure 6—figure supplement 3. Example of a 2 hr BrdU pulse-labeled MMR-deficient cell culture, demonstrating S-phase stalling and G2/M stalling upon mitomycin C exposure, S-phase stalling upon camptothecin exposure and S-phase stalling and G2/M stalling upon olaparib exposure.
Cell cycle phases in unlabeled (stalled) fractions were determined as described by Watson et al. DOI: http://dx.doi.org/10.7554/eLife.02725.036
Figure 7. MMR-deficient cells are sensitive to PARP inhibition.
(A) Dosimetry experiments assessing the effect of decreasing concentrations of olaparib on in vitro cell proliferation relative to the corresponding untreated cultures as measured by sulforhodamine B assays. (B) Cytotoxicity of olaparib, mitomycin C, ionizing radiation and paclitaxel as measured by sulforhodamine B assays. Displayed are the average concentrations (μM) or dose (gray, Gy) that inhibit 50% of the normal growth. p-values are 0.0077, 0.040, and 0.038 for olaparib, mitomycin C, and ionizing radiation, while the p-value is not significant (NS) for paclitaxel. (C) Effect of knock-down of BRCA1, BRCA2, and ATR mRNA on olaparib sensitivity of the MMR-proficient, HR-proficient MCF7 cell line. Cells were transfected with the indicated siRNA concentration (X axis), and after 24 hr incubated with 26 µM olaparib or vehicle. Another 48 hr later, cell viability was assessed using the sulforhodamine B assay. The siRNA concentration corresponding to a growth inhibition of 50% was subsequently assessed for the level of knock-down induced. The resulting values are indicated on the plots and are expressed as %. Values plotted were normalized to vehicle-treated cells transfected with a scrambled siRNA of matching concentration. DOI: http://dx.doi.org/10.7554/eLife.02725.037
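One common way to read a 50% growth-inhibitory concentration off a dose-response series is to interpolate between the two doses that bracket 50% growth. The sketch below assumes linear interpolation on a log10 dose scale; the caption does not state how the values in panel B were derived, so this is an illustration only, and the function name and toy numbers are hypothetical.

```python
import math


def ic50(doses, relative_growth):
    """Estimate the dose at which growth falls to 50% of the untreated control.

    doses: positive concentrations in ascending order.
    relative_growth: growth at each dose as a fraction of untreated growth.
    Returns None when the series never crosses 0.5.
    Assumption: linear interpolation on log10(dose) between the two
    bracketing measurements (not necessarily the method used in the paper).
    """
    points = list(zip(doses, relative_growth))
    for (d0, g0), (d1, g1) in zip(points, points[1:]):
        if g0 >= 0.5 >= g1 and g0 != g1:
            # Fraction of the way from d0 to d1 at which growth equals 0.5.
            frac = (g0 - 0.5) / (g0 - g1)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None


# Toy series (not from the paper): growth crosses 50% between 10 and 100 μM.
print(ic50([1.0, 10.0, 100.0], [0.9, 0.7, 0.3]))
```

Interpolating on a log scale reflects the usual practice of spacing test concentrations geometrically; a fitted sigmoid curve would be the more rigorous alternative when enough dose points are available.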
Figure 7—figure supplement 1. Cell proliferation of MMR-deficient cultures was measured in real-time using the xCELLigence RTCA DP system (for up to 48 hr after treatment).
Values are normalized to the vehicle-treated control. Error bars represent SEM. The average cell proliferation of 7 MMR-deficient cells (A) and 4 MMR-proficient cells (B) with increasing concentrations of olaparib (1 μM, 3 μM, 10 μM) is shown. Overall, MMR-deficient cultures were characterized by a dose-dependent decrease in proliferation, whereas MMR-proficient cells did not respond to olaparib. DOI: http://dx.doi.org/10.7554/eLife.02725.038
Author response image 1. The 54-marker panel generated from 13 Illumina-sequenced exomes for MSI testing.
Author response image 2. Clustering analysis of all samples based on the genes carrying somatic mutations in their coding regions.
