Review
Trends Genet. 2021 Aug;37(8):717-729.
doi: 10.1016/j.tig.2020.10.003. Epub 2020 Nov 13.

Hotspots of Human Mutation

Alex V Nesta et al. Trends Genet. 2021 Aug.

Abstract

Mutation of the human genome results in three classes of genomic variation: single nucleotide variants; short insertions or deletions; and large structural variants (SVs). Some mutations occur during normal processes, such as meiotic recombination or B cell development, and others result from DNA replication or aberrant repair of breaks in sequence-specific contexts. Regardless of mechanism, mutations are subject to selection, and some hotspots can manifest in disease. Here, we discuss genomic regions prone to mutation, mechanisms contributing to mutation susceptibility, and the processes leading to their accumulation in normal and somatic genomes. With further, more accurate human genome sequencing, additional mutation hotspots, mechanistic details of their formation, and the relevance of hotspots to evolution and disease are likely to be discovered.

Keywords: DNA repair; indel; mutation hotspots; recurrent mutation; selection; structural variation.

Figures

Figure 1. Types of Genetic Variation.
Single nucleotide variants (SNVs) and indels are changes that affect between one and 50 base pairs in a single event. (A) Examples of a C:T SNV, a two base pair deletion, and a three base pair insertion. (B) Examples of events of ≥50 base pairs, which constitute structural variants (SVs); these events include deletions, duplications, inversions, insertions, translocations, and complex combinations of these basic variant types. In each example, the top chromosome is the reference, and the variation is highlighted and displayed beneath.
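The size convention in this legend can be sketched in a few lines of Python. This is only an illustration, not a method from the review: the function name `classify_variant` and the parameter `sv_min_bp` are hypothetical, the 50 bp boundary is assigned to SVs as an assumption (the legend lists 50 on both sides), and only simple ref/alt allele pairs are handled, so inversions, translocations, and complex events are out of scope.

```python
def classify_variant(ref: str, alt: str, sv_min_bp: int = 50) -> str:
    """Classify a ref/alt allele pair as 'SNV', 'indel', or 'SV' by size.

    Hypothetical helper: the 50 bp cutoff follows the legend's convention,
    with exactly 50 bp assigned to SVs as an assumption.
    """
    if not ref or not alt:
        raise ValueError("alleles must be non-empty strings")
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to classify")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    # The longer allele bounds the number of bases the event affects.
    # Equal-length multi-base substitutions are lumped with indels here.
    affected_bp = max(len(ref), len(alt))
    return "SV" if affected_bp >= sv_min_bp else "indel"


if __name__ == "__main__":
    print(classify_variant("C", "T"))        # single-base change
    print(classify_variant("CAG", "C"))      # two base pair deletion
    print(classify_variant("C", "CTTT"))     # three base pair insertion
    print(classify_variant("A" * 60, "A"))   # large deletion
```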
Figure 2.
(A) The mechanism(s) of variant formation can be inferred from the sequence context as well as from the resultant variant. For example, spontaneous deamination of methylated cytosines results in C:T transitions at GC-rich loci. Apolipoprotein B mRNA editing enzyme, catalytic polypeptide (APOBEC) cytidine deaminases act upon tCw motifs, changing them to tTw or tGw (together tKw; IUPAC code K denotes G or T). This pattern is observed in kataegis. Finally, repetitive DNA regions are subject to indels when the polymerase dissociates from the template and pairs with the incorrect region upon reassociation, resulting in expansions or contractions. (B) Nonallelic homologous recombination (NAHR) can lead to rearrangements via incorrect repair between repeats. This example highlights a 1.5-Mb region in chromosome 17p11-p12. Two segmental duplications (SDs), denoted as SD A and SD B, represent the proximal and distal ~24-kb CMT1A-REP repeats. These SDs pair ectopically during meiosis and are repaired by NAHR, leading to duplication or deletion of the dosage-sensitive gene PMP22 and resulting in Charcot-Marie-Tooth disease 1A (CMT1A) or hereditary neuropathy with liability to pressure palsies (HNPP), respectively.
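As a rough illustration of how sequence context can be used in panel (A), the sketch below lists the cytosines that sit in a tCw context (w = A or T in IUPAC notation), the motif the legend associates with APOBEC activity. The function name `tcw_sites` is hypothetical, and the scan covers only the strand it is given; a real analysis would also need to consider the reverse complement.

```python
def tcw_sites(seq: str) -> list[int]:
    """Return 0-based positions of C in a TCW context (W = A or T).

    Hypothetical helper: scans only the given strand, so motifs on the
    reverse-complement strand are not reported.
    """
    seq = seq.upper()
    sites = []
    # Start at 1 and stop one short of the end so both neighbours exist.
    for i in range(1, len(seq) - 1):
        if seq[i] == "C" and seq[i - 1] == "T" and seq[i + 1] in "AT":
            sites.append(i)
    return sites


if __name__ == "__main__":
    # TCA and TCT match; TCG does not, because G is not covered by W.
    print(tcw_sites("ATCAGTCTTCG"))
```

A cluster of C-to-T or C-to-G changes at positions reported by such a scan would be consistent with the APOBEC pattern described in the legend, although the motif alone does not establish the mechanism.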
Figure 3. Mutations during DNA Replication.
(A) Regions replicated during early S phase generally comprise euchromatic DNA and are localized towards the center of the nucleus. Regions replicated during late S phase are primarily heterochromatic and localized near the nuclear periphery. Replication timing is correlated with mutation type. Structural variants (SVs) mediated by nonallelic homologous recombination (NAHR) primarily occur during early S phase. Structural variants mediated by nonhomologous end-joining (NHEJ) as well as single-base transitions and transversions occur more frequently during late S phase. (B) Transcriptionally active G nucleotide-rich repeat tracts can result in secondary structure-forming G-quadruplexes and R-loops. The stabilization of transcription forks at these loci poses a threat to genomic integrity in two ways. First, single-stranded DNA at R-loops is exposed to apolipoprotein B mRNA editing enzyme catalytic polypeptide-like (APOBEC) family enzymes, allowing C:U mutations on the free strand (indicated as red asterisks). Second, if R-loops are not resolved before replication, the transcription and replication forks may collide, causing replication fork collapse and potentially triggering SV formation. Abbreviations: DNAP, DNA polymerase; RNAP, RNA polymerase.
Figure 4. Replication-Based Repair Mechanisms Are Associated with Clustered Mutations.
(A) Repair pathways that expose single-stranded DNA are subject to apolipoprotein B mRNA editing enzyme, catalytic polypeptide (APOBEC)-mediated hypermutation. Following replication fork collapse, mutations are primarily focused on exposed single-stranded DNA. In this break-induced replication repair model, the leading strand from the damaged chromosome invades the homologous region of its mate and forms a displacement loop (gray dashed-line box). This single-stranded DNA is exposed to lesions from APOBEC-mediated deamination. These lesions are retained as the lagging strand of the broken chromosome is synthesized. (B) Replication-based repair mechanisms use the homologous chromosome as a template for repair. This creates localized regions of conservative replication, leading to accumulation of variants in the homologous region. The displacement loop created by strand invasion can proceed for regions up to ~1 Mb, and this process introduces errors or mutations that are not efficiently repaired.
Figure I. V(D)J Recombination and Somatic Hypermutation.
Euchromatic H3K4me3 histone modifications at the immunoglobulin locus recruit the recombination complex during B cell development. RAG1 and RAG2 (pink and red ovals) bind two recombination signal sequences (RSS; dark-green and light-green arrows) that flank each V, D, or J coding segment (dark-green, light-green, and yellow rectangles, respectively). These segments are processed and ligated together by the NHEJ machinery, usually deleting the intervening sequences. Occasionally, cuts in the RSS (indicated by red triangles in RSS) leave an uneven overhang. Translesion synthesis then fills in the missing information, potentially forming indels during the ligation process. This process generates the antibody-defining coding joint and a signal joint that is circularized to prevent further recombination. After a B cell recognizes an antigen, the immunoglobulin locus may undergo somatic hypermutation to increase antibody specificity. During this process, AID (purple oval) deaminates cytosines on the nontemplate DNA strand to uracil (orange triangles). The converted residue mimics thymidine during DNA replication, and the U:G mismatch triggers the DNA repair process, creating point mutations in the immunoglobulin gene. Positive selection acts on B cells that best bind a corresponding antigen.
