Nat Neurosci. 2022 Apr;25(4):446-457.
doi: 10.1038/s41593-022-01033-5. Epub 2022 Apr 4.

Exome sequencing of individuals with Huntington's disease implicates FAN1 nuclease activity in slowing CAG expansion and disease onset

Branduff McAllister et al. Nat Neurosci. 2022 Apr.

Abstract

The age at onset of motor symptoms in Huntington's disease (HD) is driven by HTT CAG repeat length but modified by other genes. In this study, we used exome sequencing of 683 patients with HD with extremes of onset or phenotype relative to CAG length to identify rare variants associated with clinical effect. We discovered damaging coding variants in candidate modifier genes identified in previous genome-wide association studies associated with altered HD onset or severity. Variants in FAN1 clustered in its DNA-binding and nuclease domains and were associated predominantly with earlier-onset HD. Nuclease activities of purified variants in vitro correlated with residual age at motor onset of HD. Mutating endogenous FAN1 to a nuclease-inactive form in an induced pluripotent stem cell model of HD led to rates of CAG expansion similar to those observed with complete FAN1 knockout. Together, these data implicate FAN1 nuclease activity in slowing somatic repeat expansion and hence onset of HD.

Conflict of interest statement

V.C.W. is a scientific advisory board member of Triplet Therapeutics, a company developing new therapeutic approaches to address triplet repeat disorders such as Huntington’s disease and myotonic dystrophy. V.C.W.’s financial interests in Triplet Therapeutics were reviewed and are managed by Massachusetts General Hospital and Mass General Brigham in accordance with their conflict of interest policies. V.C.W. is also a scientific advisory board member of LoQus23 Therapeutics and has provided paid consulting services to Alnylam. J.-M.L. is on the scientific advisory board of GenEdit. J.D.L. is a paid advisory board member for F. Hoffmann-La Roche and uniQure biopharma and is a paid consultant for Vaccinex, Wave Life Sciences, Genentech, Triplet Therapeutics and PTC Therapeutics. E.H.A. serves on a Data Safety Monitoring Board for Roche. G.B.L. has provided consulting services, advisory board functions, clinical trial services and/or lectures for Allergan, Alnylam, Amarin, AOP Orphan Pharmaceuticals, Bayer Pharma, the CHDI Foundation, GlaxoSmithKline, F. Hoffmann-La Roche, Ipsen, Isis Pharma, Lundbeck, Neurosearch, Medesis, Medivation, Medtronic, NeuraMetrix, Novartis, Pfizer, Prana Biotechnology, Sangamo/Shire, Siena Biotech, Temmler Pharma and Teva Pharmaceuticals. G.B.L. has also received research grant support from the CHDI Foundation, the Bundesministerium für Bildung und Forschung, the Deutsche Forschungsgemeinschaft and the European Commission (EU-FP7, JPND). His study site in Ulm has received compensation in the context of the observational Enroll-HD Study from Teva, Isis, F. Hoffmann-La Roche and the Gossweiler Foundation. He receives royalties from Oxford University Press and is employed by the State of Baden-Württemberg at the University of Ulm. A.E.R. is chair of the European Huntington’s Disease Network executive committee and is the global PI for Triplet Therapeutics. J.S.P. has provided consulting services, advisory board functions and clinical trial services for Acadia, F. Hoffmann-La Roche, Wave Life Sciences and the CHDI Foundation. J.F.G. is a scientific advisory board member and has a financial interest in Triplet Therapeutics. His National Institutes of Health-funded project is using genetic and genomic approaches to uncover other genes that significantly influence when diagnosable symptoms emerge and how rapidly they worsen in Huntington’s disease. The company is developing new therapeutic approaches to address triplet repeat disorders, such as Huntington’s disease, myotonic dystrophy and spinocerebellar ataxias. His interests were reviewed and are managed by Massachusetts General Hospital and Mass General Brigham in accordance with their conflict of interest policies. J.F.G. has also been a consultant for Wave Life Sciences. Within the last 5 years, D.G.M. has been a scientific consultant and/or received honoraria/stock options/research contracts from AMO Pharma, Charles River Laboratories, LoQus23, Small Molecule RNA, Triplet Therapeutics and Vertex Pharmaceuticals. L.J. is a member of the scientific advisory boards of LoQus23 Therapeutics and Triplet Therapeutics. T.H.M. is an associate member of the scientific advisory board of LoQus23 Therapeutics. B.M., J.D., C.S.B., S.P., U.C., G.E., J.S., S.L., L.E., L.-N.S., E.R., G.M., M.C., A.M., M.J.C., E.P.H., D.L., M.E.M., N.M.W., N.D.A. and P.H. have nothing to disclose.

Figures

Fig. 1. Selection of HD study population with extremes of onset or phenotype.
a, REGISTRY-HD group. Age at motor onset against inherited pure CAG length for 6,086 patients with HD with 40–55 CAGs in the REGISTRY-HD study, using repeat lengths previously determined by PCR fragment length analysis. Individuals with very early (orange, n = 250) or very late (green, n = 250) motor onset given their inherited CAG length were selected for analysis. b, c, PREDICT-HD group, extremes of phenotype. Individuals with more severe (red dots) or less severe (blue dots) clinical phenotypes in the PREDICT-HD cohort were selected for analysis. Residuals from LOESS were used to identify individuals using TMS (n = 117) (b) or SDMT (n = 85) (c) and are plotted against CAP score to visualize age and CAG effects. Higher CAP scores represent greater disease burden. d, PREDICT-HD group, extremely early or late onset (predicted). A time-to-onset model was used to stratify the PREDICT-HD population and select a further cohort of predicted extreme early (red dots) or late (blue dots) onset individuals (n = 119 selected).
Fig. 2. Non-canonical HTT CAG repeat sequences in expanded alleles are associated with altered onset of HD.
a, Graphical overview of the 3′ end of the HTT exon 1 CAG repeat showing canonical (a–b) and non-canonical (c–h) trinucleotide arrangements identified by ultra-high-depth MiSeq sequencing in the REGISTRY-HD cohort. The clinical onset groups in which the non-canonical alleles were found are indicated on the left. Amino acids encoded by trinucleotides are shown on the right. Gln, glutamine; Pro, proline; His, histidine. b, HTT CAG repeat allele sequences and counts from ultra-high-depth MiSeq sequencing of the REGISTRY-HD cohort (n = 419 passing quality control; n = 213 with early onset relative to inherited pure CAG length; n = 206 with late onset relative to inherited pure CAG length. Note that CAG lengths are derived from MiSeq data). Allele counts for expanded (pathogenic) and unexpanded (wild-type) HTT alleles are shown, for both early-onset and late-onset groups. Allele groups refer to those illustrated in a. Interrupting trinucleotides within the CAG and CCG tracts are highlighted in bold. Range refers to pure CAG lengths in expanded alleles, where applicable. NA, not applicable.
Fig. 3. Rare deleterious FAN1 variants are associated with altered HD onset and cluster in functional protein domains.
a, Rare, non-synonymous FAN1 variants identified through exome sequencing in the dichotomous HD cohort (n = 637), divided between early/more severe and late/less severe phenotype groups. MAF < 1%. A total of 65 such variants (28 different) were identified across 62 individuals. Three people carried two variants. CADD score is a measure of predicted deleteriousness of a coding variant. CADD ≥ 20 implies that a variant is in the 1% predicted most damaging substitutions in the human genome. A total of 43 individuals carried at least one such predicted damaging variant, with two people carrying two (although these could not be phased). b, FAN1 variants identified in individuals with HD, plotted by CADD score over a cartoon of FAN1 protein. Variants associated mostly with early/more severe phenotype (orange triangles), late/less severe phenotype (green triangles) or neither phenotype group (gray squares, ‘neutral’) are shown. Variants above the CADD = 20 line are predicted to be in the top 1% most damaging variants in the human genome; those with CADD > 10 are predicted to be in the top 10%. Two likely damaging singleton variants lack CADD scores and so are plotted as CADD = 0. They are highlighted: loss-of-function (frameshift) variant ST186SX (*) and in-frame insertion variant V963W964insL (†). FAN1 domain coordinates as published. c, Damaging FAN1 variants are enriched in individuals with earlier-onset HD after accounting for CAG length. Age at motor onset against CAG length is plotted for the continuous phenotype group (n = 558), with population predicted age at onset for each repeat length shown with horizontal lines. No median onset is shown for CAG lengths of 38 and 39 as they are incompletely penetrant. Individuals with a damaging FAN1 variant (CADD ≥ 20 or loss of function) are shown as black dots; those without one are shown as open circles. d, Three-dimensional model highlighting FAN1 variants selected for downstream study. Note that D960A (*) is a synthetic variant lacking nuclease activity not found in our patient population. NA, not applicable.
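The CADD thresholds quoted in this caption (CADD ≥ 20 for the top 1%, CADD > 10 for the top 10%) follow from the PHRED-like scaling of CADD scores, in which the scaled score equals −10·log10 of a variant's rank fraction among all possible substitutions. The short Python sketch below illustrates that relationship. The function names are hypothetical and chosen for illustration only; this is not the annotation pipeline used in the study.

```python
def cadd_top_fraction(scaled_score: float) -> float:
    """Fraction of possible substitutions ranked at least as deleterious.

    Assumes the PHRED-like scaling: score = -10 * log10(rank / total).
    """
    return 10 ** (-scaled_score / 10)


def is_predicted_damaging(scaled_score: float, threshold: float = 20.0) -> bool:
    """Apply the caption's CADD >= 20 cutoff (threshold kept as a parameter)."""
    return scaled_score >= threshold


if __name__ == "__main__":
    # Show how the fraction shrinks tenfold for every 10 points of score.
    for score in (10, 20, 30):
        print(score, cadd_top_fraction(score), is_predicted_damaging(score))
```

Under this scaling, each 10-point increase in score corresponds to a tenfold smaller fraction of substitutions, which is why a single cutoff of 20 isolates the predicted top 1%.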
Fig. 4. Nuclease activity of FAN1 variants identified in individuals with HD correlates with residual age at onset of motor symptoms.
a, Lymphoblastoid cell lines derived from patients carrying a heterozygous R507H FAN1 variant are significantly more sensitive to mitomycin C than age-matched and pure CAG-length-matched control HD lines with homozygous wild-type FAN1 (**P = 1.3 × 10⁻², two-tailed t-test). Four independent lines for each genotype, mean of three independent experiments shown for each line (dots), as well as mean ± s.e.m. for each genotype (horizontal lines). b, Representative gel showing nuclease activity of FAN1 variants on 5′ flap DNA. FAN1 protein (10 nM) was incubated with fluorescently labeled 5′ flap DNA substrate (5 nM) for 10 minutes at 37 °C in the presence of MnCl₂. Reactions were denatured and analyzed using 15% TBE-urea gel electrophoresis. All experiments were repeated at least three times. c, FAN1 variants identified in individuals with HD have significantly reduced nuclease activity compared to wild-type FAN1. Variants associated with early/more severe phenotypes (orange) had less nuclease activity than variants associated with late/less severe phenotypes (green). The nuclease-inactive D981A R982A FAN1 variant was used as a negative control. Activities of variants are normalized to wild-type FAN1 nuclease activity. (n = 3 independent experiments, **P < 1 × 10⁻² and ****P < 1 × 10⁻⁴; one-way ANOVA; mean ± s.d. shown). d, Graph of mean age at motor onset residual (using pure CAG length from sequencing) against FAN1 nuclease activity for six variants, normalized to wild-type FAN1 activity: R377W n = 10; R507H n = 18; D702E n = 1; K794R n = 1; R982C n = 1; C1004G n = 1. There was a significant correlation between average motor onset residual and in vitro nuclease activity (P = 2.7 × 10⁻²). Mean ± s.d. nuclease activity is shown for each variant (n = 3 independent experiments). Three individuals had two FAN1 variants: C1004G and R507H; R982C and N621S; and R377W and R507H. For analyses, these individuals were included in the groups of the most damaging of the two variants they carried. See also Supplementary Fig. 3. e, Graph of age at motor onset against CAG length for the continuous phenotype group (n = 558), highlighting those individuals carrying damaging FAN1 variants assayed for nuclease activity. FAN1 nuclease activity: very low (<20% of wild-type), low (20–50% of wild-type) and moderate (50–80% of wild-type). nt, nucleotide; WT, wild-type.
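The activity bins named in panel e (very low, low, moderate) can be written as a small lookup, sketched below in Python. The function name is hypothetical. The handling of the exact boundaries (20%, 50%, 80%) and the label for activities of 80% or more are assumptions, since the caption does not specify them.

```python
def nuclease_activity_category(relative_activity: float) -> str:
    """Bin nuclease activity expressed as a fraction of wild-type (1.0 = 100%).

    Bins follow the caption: very low (<20%), low (20-50%), moderate (50-80%).
    Assumption: each boundary value falls into the higher bin.
    """
    if relative_activity < 0:
        raise ValueError("relative activity cannot be negative")
    if relative_activity < 0.20:
        return "very low"
    if relative_activity < 0.50:
        return "low"
    if relative_activity < 0.80:
        return "moderate"
    # Assumption: the caption names no bin at or above 80% of wild-type.
    return "not binned"


if __name__ == "__main__":
    # Placeholder inputs for illustration, not measured values from the study.
    for value in (0.05, 0.35, 0.65, 1.0):
        print(value, nuclease_activity_category(value))
```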
Fig. 5. FAN1 slows the rate of HTT CAG repeat expansions in a nuclease-dependent manner in an iPSC model of HD.
a, Immunoblot of FAN1 in Q109 HD iPSC lines. Parent Q109 lines with wild-type FAN1 (Q109-n5 (lane 1) and Q109-n1 (lane 5)); Q109 lines with D960A variant (heterozygous (lane 2) and homozygous (lane 3)); and Q109 FAN1 knockout lines (Q109-n5 (lane 4) and Q109-n1 (lane 6)). Blots were repeated twice to confirm results. b, Representative electropherograms of fluorescent PCR and capillary electrophoresis of the HTT CAG repeat in 109-FAN1+/+, 109-FAN1−/−, 109-FAN1+/D960A and 109-FAN1D960A/D960A iPSCs at baseline and after time in culture. The red dotted line indicates baseline HTT CAG repeat length. c, 109-FAN1−/− iPSCs (n = 6 clones) exhibit significantly faster HTT CAG repeat expansion rates than 109-FAN1+/+ iPSCs (n = 7 clones) (0.0661 CAG per day) (P = 1.5 × 10⁻⁵). Genome editing performed in Q109-n1 iPSCs. d, Change in modal HTT CAG repeat in post-mitotic neurons generated from 109-FAN1+/+ iPSCs (n = 5) and 109-FAN1−/− iPSCs (n = 4 clones). 109-FAN1−/− neurons demonstrate significantly faster rates of HTT CAG repeat expansion than 109-FAN1+/+ neurons (P = 1.1 × 10⁻²). e, FAN1D960A HD-iPSCs show dose-dependent increase in HTT CAG repeat expansion over time in culture (n = 3 clones per genotype, each cultured in triplicate wells). Genome editing performed in Q109-n5 iPSCs. f, D960A mutations enhance HTT CAG repeat expansions in a dose-dependent manner in NPCs derived from 109-FAN1+/+, 109-FAN1+/D960A and 109-FAN1D960A/D960A iPSCs (n = 1 clone per genotype, cultured in triplicate wells). Values are expressed as mean ± s.e.m.
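An expansion rate in CAG per day, as quoted in panel c, can be expressed as the slope of modal CAG length against days in culture. The Python sketch below uses an ordinary least-squares slope as one simple way to obtain such a rate. The caption does not state how the rates were fitted, so this is an assumption rather than the authors' method, and the function name and example time course are placeholders, not study data.

```python
from typing import Sequence


def expansion_rate(days: Sequence[float], modal_cag: Sequence[float]) -> float:
    """Least-squares slope of modal CAG length versus days in culture.

    Returns the rate in CAG repeats per day.
    """
    n = len(days)
    if n != len(modal_cag) or n < 2:
        raise ValueError("need at least two paired time points")
    mean_x = sum(days) / n
    mean_y = sum(modal_cag) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, modal_cag))
    return sxy / sxx


if __name__ == "__main__":
    # Made-up time course for illustration only (not data from the study).
    example_days = [0, 10, 20, 30]
    example_cag = [109, 110, 111, 112]
    print(expansion_rate(example_days, example_cag), "CAG per day")
```

Comparing such slopes between genotypes, clone by clone, is one way to frame the knockout versus wild-type comparison described in the caption.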
Fig. 6. Model of how FAN1 nuclease activity might prevent repeat expansions.
The fully base-paired (CAG)•(CTG) tract is in dynamic equilibrium with a four-way junction that includes loop-outs of (CAG)n and (CTG)n on their respective strands. Under normal cellular conditions, most repeats are in their native double-stranded conformation. However, when a longer repeat tract (>35 CAG) is present, it can adopt a more stable four-way structure that can be further bound and stabilized by MSH2/MSH3 (MutSβ) (1). This four-way junction can be cleaved on both strands in either of two orientations (A or B) by MutL complexes (2). The resulting DNA products have long overhangs, either 3′ (A) or 5′ (B), and they can either anneal fully to re-form the starting genomic DNA with no change in repeat tract length (top) or they can slip before partial reannealing (3). Slipped products can have 5′ or 3′ flaps, and these are a substrate for FAN1 nuclease cleavage (bold arrows) and subsequent ligation, yielding repeat contractions (4a). Alternatively, the slipped products can have gaps, and these are substrates for gap-filling DNA polymerases, with subsequent ligation yielding repeat expansions (4b).
Extended Data Fig. 1. Quality control and annotation pipelines for HD exome sequencing data.
a, Quality control pipeline showing where and why sequencing samples were removed from the dataset. From an initial 785 sequenced exomes, including some samples re-sequenced due to initial low quality, 683 passed all quality control steps (465 from REGISTRY-HD, 218 from PREDICT-HD). Subgroups of this population were used in downstream analyses: a continuous group (N = 558) containing all individuals with a known age at motor onset and a dichotomous group (N = 637) containing all individuals with an extreme phenotype, either early or late actual or predicted onset of symptoms, or more or less severe motor or cognitive symptom scores. See also Fig. 1 and Supplementary Fig. 1. b, Annotation pipeline indicating the pathway, databases (gnomAD and dbNSFP) and tools used to annotate individual variants across exomes. Key: VEP, variant effect predictor tool. See also Supplementary Fig. 2.
Extended Data Fig. 2. FAN1 knockout and D960A editing using CRISPR-Cas9.
a, Schematic depicting CRISPR-Cas9 targeting of exon 2 of FAN1 in Q109-n1 and Q109-n5 using two guide RNAs (gRNAs) to induce a 94 bp deletion leading to a premature stop codon and FAN1 knockout. The primer pair used for PCR screening of exon 2 after CRISPR is also shown. b, Diagnostic PCR screen using primers FAN1-KO-F and FAN1-KO-R showing representative banding patterns for iPSC lines with the three possible FAN1 genotypes after CRISPR: FAN1+/+ (230/230 bp; wild-type), FAN1+/− (230/135 bp) and FAN1−/− (135/135 bp). c, Sanger sequencing of PCR products demonstrates the targeted 94 bp deletion in exon 2 of FAN1. d, Undifferentiated iPSCs stained for the pluripotency marker OCT4. iPSC-derived neurons stained positive for the neuronal marker MAP2 (red) and CTIP2 (green). All nuclei are counterstained with DAPI (blue). e, Schematic depicting CRISPR-Cas9 targeting of exon 13 of FAN1 in Q109-n5 using a homology directed repair (HDR) template. A single guide RNA sequence (grey) and a 122 bp HDR template containing the desired gene edit coding for an amino acid change (D960A) were utilised to generate FAN1-nuclease dead clones. The HDR template contained two silent mutations (lowercase) to prevent Cas9 re-cutting of the edited region and to introduce a StuI restriction site for diagnostic screening. f, Restriction digest with StuI confirms Q109-n5 FAN1+/+, Q109-n5 FAN1+/D960A and Q109-n5 FAN1D960A/D960A genotypes. StuI cleaves the 442 bp PCR product into 124 and 318 bp products only in the presence of the silent 2868 G > A mutation. The parental Q109-n5 line is also shown as a negative control (right). g, Sanger sequencing of PCR products confirms successful introduction of D960A variant. h, Virtual karyotyping of iPSC lines using copy number variant (CNV) analysis at the time of repeat expansion experiments. CNV analysis reveals small deletions at 2q22.1 and 14q24.3 and duplications at 12q14.2 in all samples. Duplication of chromosome 1 shown in all but one sample. 
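The StuI diagnostic digest in panel f implies a distinct band pattern for each genotype: the 442 bp product stays intact for a wild-type allele and is cut into 318 and 124 bp fragments for an edited allele. The Python sketch below enumerates the expected bands. It assumes complete digestion, and the function name and allele labels are hypothetical conveniences rather than anything from the study's protocol.

```python
AMPLICON_BP = 442                 # uncut PCR product size from the caption
STUI_FRAGMENTS_BP = (318, 124)    # fragments when the StuI site is present


def expected_bands(genotype: tuple) -> list:
    """Expected band sizes (bp, largest first) after StuI digest.

    genotype: a pair of allele labels, each "WT" or "D960A".
    Assumes digestion goes to completion on edited alleles.
    """
    bands = set()
    for allele in genotype:
        if allele == "D960A":
            bands.update(STUI_FRAGMENTS_BP)
        elif allele == "WT":
            bands.add(AMPLICON_BP)
        else:
            raise ValueError(f"unknown allele label: {allele!r}")
    return sorted(bands, reverse=True)


if __name__ == "__main__":
    for genotype in (("WT", "WT"), ("WT", "D960A"), ("D960A", "D960A")):
        print(genotype, expected_bands(genotype))
```

The two fragment sizes sum to the uncut amplicon length, which is a quick consistency check on the sizes given in the caption.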
