Genome Med. 2025 Aug 25;17(1):95.
doi: 10.1186/s13073-025-01525-6.

Replication-associated mechanisms contribute to an increased CpG > TpG mutation burden in mismatch repair-deficient cancers

Joseph C Ward et al. Genome Med.

Abstract

Background: Single base substitution (SBS) mutations, particularly C > T and T > C, are increased owing to unrepaired DNA replication errors in mismatch repair-deficient (MMRd) cancers. Excess CpG > TpG mutations have been reported in MMRd cancers defective in mismatch detection (dMutSα), but not in mismatch correction (dMutLα). Somatic CpG > TpG mutations conventionally result from unrepaired spontaneous deamination of 5-methylcytosine throughout the cell cycle, causing T:G mismatches and signature SBS1. It has been proposed that MutSα detects those mismatches, prior to error correction by base excision repair (BER). However, other evidence appears inconsistent with that hypothesis: for example, MutSα is specifically expressed in S/G2 phases of the cell cycle, and defects in replicative DNA polymerase proofreading specifically cause excess CpG > TpG mutations in signature SBS10b.

Methods: We analysed mutation spectra and COSMIC mutation signatures in whole-genome sequencing data from 1803 colorectal cancers (164 dMutLα, 20 dMutSα) and 596 endometrial cancers (103 dMutLα, 9 dMutSα) from the UK 100,000 Genomes Project. We mapped each C > T mutation to its genomic features, including normal DNA methylation state, replication timing, transcription strand, and replication strand, to investigate the mechanism(s) by which these mutations arise.

Results: We confirmed that dMutSα tumours specifically had higher CpG > TpG burdens than dMutLα tumours. We could fully reconstitute the observed dMutSα CpG > TpG mutation spectrum by adding CpG > TpG mutations in proportion to their SBS1 activity to the dMutLα spectrum. However, other evidence indicated that the SBS1 excess in dMutSα cancers did not come from 5-methylcytosine deamination alone: non-CpG C > T mutations were also increased in dMutSα cancers; and, in contrast to tumours deficient in BER, CpG > TpG mutations were biased to the leading DNA replication strand, at similar levels in dMutSα and dMutLα cancers, suggesting an origin in DNA replication. Other substitution mutations usually corrected by BER were not increased in dMutSα tumours.

Conclusions: There is a CpG > TpG and SBS1 excess specific to dMutSα MMRd tumours, consistent with previous reports, and we find a general increase in somatic C > T mutations. Contrary to some other studies, the similar leading replication strand bias in both dMutSα and dMutLα tumours indicates that at least some of the excess CpG > TpG mutations arise via DNA replication errors, and not primarily via the replication-independent deamination of 5-methylcytosine.
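
The spectrum reconstitution described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the four-channel CpG > TpG layout, the function names, and every number below are assumptions made up for illustration, and the "target" spectrum is constructed so that the position of the peak is known in advance. The sketch only shows the general idea of adding mutations in proportion to a signature's channel weights and tracking cosine similarity against a target spectrum.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def add_in_proportion(base_counts, signature, n_added):
    """Add n_added mutations to base_counts, split by the signature's weights."""
    total = sum(signature)
    return [c + n_added * s / total for c, s in zip(base_counts, signature)]


def sweep(base_counts, signature, target, steps):
    """Cosine similarity to target after each amount of added mutations."""
    return [
        (n, cosine_similarity(add_in_proportion(base_counts, signature, n), target))
        for n in steps
    ]


# Assumed channel layout: the four CpG > TpG trinucleotide contexts.
CHANNELS = ["A[C>T]G", "C[C>T]G", "G[C>T]G", "T[C>T]G"]

# Toy values only: NOT study data and NOT the COSMIC SBS1 profile.
base_spectrum = [400.0, 300.0, 500.0, 200.0]   # stands in for a dMutLα-like spectrum
sbs1_like = [0.40, 0.25, 0.25, 0.10]           # stands in for SBS1 channel weights

# Build the target from the base plus 600 added mutations, so the sweep
# should peak at exactly that amount.
target_spectrum = add_in_proportion(base_spectrum, sbs1_like, 600)

curve = sweep(base_spectrum, sbs1_like, target_spectrum, range(0, 1201, 100))
best_n, best_similarity = max(curve, key=lambda point: point[1])
```

With real data, the target would be an observed spectrum rather than a constructed one, so the peak similarity would generally fall short of 1.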

Keywords: Colorectal cancer; DNA repair; Mismatch repair deficiency; Mutagenesis; Mutation signatures.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The Genomics England National Genomic Research Library is a research resource with approval to operate as a Research Tissue Bank by the HRA/Cambridge Central Research Ethics Committee (REC reference 20/EE/0035), with the core principles of the Declaration of Helsinki embedded into day-to-day operations. All 100kGP cancer whole-genome sequencing data used in this study can be accessed within the secure Genomics England Research Environment, subject to institutional agreements (see the Data Availability statement below). Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Colorectal cancers included in mutation analysis. A summary of the primary, treatment-naïve colorectal cancers (CRCs) subjected to PCR-free whole-genome sequencing analysis from V18 of the UK 100,000 Genomes Project. mSINGS = Detecting MSI by Next-Generation Sequencing. MMRp = Mismatch repair-proficient. MMRd = Mismatch repair-deficient. POLE = DNA polymerase ε. POLD1 = DNA polymerase δ
Fig. 2
The SBS landscapes of MMRp and MMRd colorectal cancers. The total single base substitution (SBS) burden (A) of mismatch repair-proficient (MMRp, grey) and mismatch repair-deficient (MMRd, orange) colorectal cancers. Also shown are the burdens (B) and activities (C) of the six SBS mutation channels: C > A, C > G, C > T (split into CpG > TpG and non-CpG C > T mutations), T > A, T > C and T > G. D The activities of the “clock-like” SBS mutation signatures SBS1 and SBS5, as well as the MMRd-associated signatures SBS15, SBS26, SBS44, and the potential artefact signature SBS57 in the MMRp and MMRd colorectal cancers. Other mutation signatures include SBS2, SBS7c, SBS10a, SBS13, SBS17a, SBS17b, SBS18, SBS28, SBS40, SBS56, SBS93, and SBS94
Fig. 3
The SBS landscapes of dMutSα and dMutLα colorectal cancers. The total single-base substitution (SBS) burden (A) of MutLα-deficient (dMutLα, purple) or MutSα-deficient (dMutSα, green) colorectal cancers. Also shown are the burdens (B) and activities (C) of the six SBS mutation channels: C > A, C > G, C > T (split into CpG > TpG and non-CpG C > T mutations), T > A, T > C and T > G. D The activities of the “clock-like” SBS mutation signatures SBS1 and SBS5, as well as the MMRd-associated signatures SBS15, SBS21, SBS26, SBS44, and the potential artefact signature SBS57 in the dMutLα and dMutSα colorectal cancers. Other mutation signatures include SBS7c, SBS14, SBS17a, SBS18, SBS20, SBS36, SBS41, SBS93, and mutations not assigned to any pre-existing COSMIC mutation signature
Fig. 4
Mutation spectra of dMutLα and dMutSα colorectal and endometrial cancers. The activities (proportions of all 96 SBS channels) of each specific substitution are shown for dMutLα colorectal cancers (A), dMutSα colorectal cancers (B), and the same for endometrial cancers (C, D)
Fig. 5
Addition of CpG > TpG mutations in proportion to SBS1 channel activities to the observed dMutLα mutation spectrum causes near-identity to the observed dMutSα spectrum in colorectal cancer. The y-axis shows the cosine similarity between the CpG > TpG mutation spectrum of dMutSα CRCs and the same spectrum in dMutLα CRCs when SBS1-associated mutations are proportionally added. As more SBS1 mutations are added (i.e. x-axis values increase), the cosine similarity rises to a peak of 0.999. The effects of adding in SBS15 and SBS44 channels are shown for comparison, with no evidence of an effect for either signature
Fig. 6
De novo mutation signatures extracted from MMRd colorectal cancers. The activities (proportions of all 96 SBS channels) of each specific substitution are shown for the de novo mutation signatures SBSCRC-MMRd-A (A) and SBSCRC-MMRd-B (B), extracted from mismatch repair-deficient (MMRd) colorectal cancers. SBSCRC-MMRd-A and SBSCRC-MMRd-B had a cosine similarity to one another of 0.675, while the sixteen C > T mutation channels were more similar (cosine similarity 0.736). Also presented are the signature plots approximated from the study by Fang et al. [14], Signature A (C) and Signature B (D). Also shown are the pairwise cosine similarities of the sixteen C > T mutation channels of these de novo signatures with the dMutLα and dMutSα colorectal cancer spectra, as well as the spectra of the COSMIC signatures SBS1, SBS15, SBS44, and SBS57 (E). The spectrum of the sixteen C > T mutation channels in SBSCRC-MMRd-A most closely resembled the COSMIC signature SBS15 (cosine similarity 0.814), while SBSCRC-MMRd-B most resembled SBS44 (cosine similarity 0.797)
Fig. 7
Replication strand bias of C > T mutations in MMRd CRC and EC. The replication strand (log2(Leading/Lagging)) bias of CpG > TpG and non-CpG C > T mutations in mismatch repair-deficient (MMRd) colorectal cancers (A) and endometrial cancers (B), classified as MutLα-deficient (dMutLα, purple) or MutSα-deficient (dMutSα, green). P comparing dMutLα to dMutSα from a Wilcoxon test. P comparing CpG > TpG versus non-CpG C > T from a Wilcoxon signed-rank test
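
The replication strand bias metric named in the Fig. 7 caption, log2(Leading/Lagging), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' method: it assumes each mutation has already been assigned to the leading or lagging strand, the function name and tumour labels are hypothetical, and the counts are invented toy values. Any normalisation the study may have applied is not described here and is not modelled.

```python
import math
from statistics import median


def strand_bias(leading, lagging):
    """log2(leading / lagging): above 0 favours the leading strand, below 0 the lagging strand."""
    if leading <= 0 or lagging <= 0:
        raise ValueError("both strand counts must be positive")
    return math.log2(leading / lagging)


# Hypothetical per-tumour (leading, lagging) counts of CpG > TpG mutations.
# Toy values only, not study data.
toy_counts = {
    "tumour_1": (1200, 800),
    "tumour_2": (400, 400),
    "tumour_3": (300, 600),
}

biases = {name: strand_bias(*counts) for name, counts in toy_counts.items()}
median_bias = median(biases.values())
```

A value of 0 means no bias, and each unit corresponds to a two-fold difference between strands. Comparing two groups of tumours, as in the caption, would need a rank-based test on the per-tumour values, which is left out of this sketch.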

References

    1. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415–21.
    2. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, et al. The repertoire of mutational signatures in human cancer. Nature. 2020;578(7793):94–101.
    3. Hendrich B, Hardeland U, Ng HH, Jiricny J, Bird A. The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites. Nature. 1999;401(6750):301–4.
    4. Ehrlich M, Zhang XY, Inamdar NM. Spontaneous deamination of cytosine and 5-methylcytosine residues in DNA and replacement of 5-methylcytosine residues with cytosine residues. Mutat Res. 1990;238(3):277–86.
    5. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res. 2019;47(D1):D941–7.
