Nat Genet. 2023 Jan;55(1):34-43.
doi: 10.1038/s41588-022-01257-y. Epub 2022 Dec 15.

Human genetic diversity alters off-target outcomes of therapeutic gene editing

Samuele Cancellieri et al. Nat Genet. 2023 Jan.

Abstract

CRISPR gene editing holds great promise to modify DNA sequences in somatic cells to treat disease. However, standard computational and biochemical methods to predict off-target potential focus on reference genomes. We developed an efficient tool called CRISPRme that considers single-nucleotide polymorphism (SNP) and indel genetic variants to nominate and prioritize off-target sites. We tested the software with a BCL11A enhancer targeting guide RNA (gRNA) showing promise in clinical trials for sickle cell disease and β-thalassemia and found that the top candidate off-target is produced by an allele common in African-ancestry populations (MAF 4.5%) that introduces a protospacer adjacent motif (PAM) sequence. We validated that SpCas9 generates strictly allele-specific indels and pericentric inversions in CD34+ hematopoietic stem and progenitor cells (HSPCs), although high-fidelity Cas9 mitigates this off-target. This report illustrates how genetic variants should be considered as modifiers of gene editing outcomes. We expect that variant-aware off-target assessment will become integral to therapeutic genome editing evaluation and provide a powerful approach for comprehensive off-target nomination.

Conflict of interest statement

Competing financial interests statement

L.P. has financial interests in Edilytics, Inc., Excelsior Genomics, and SeQure Dx, Inc. L.P.’s interests were reviewed and are managed by Massachusetts General Hospital and Partners HealthCare in accordance with their conflict of interest policies. The remaining authors declare no competing interests.

Figures

Extended Data Figure 1. Top 100 predicted off-target sites for BCL11A-1617 spacer by CFD score.
CRISPRme search as in Fig. 1. Candidate off-target sites within coding regions based on GENCODE annotations and ATAC-seq peaks in HSCs based on user-provided annotations (data from Corces et al. 2016) are highlighted.
Extended Data Figure 2. Plots with rank ordered correlation between CFD and CRISTA reported targets.
Scatter plots show, from left to right, the correlation of ranked targets, extracted by selecting the top 10,000 targets ordered by CFD and CRISTA score, respectively. The left plot shows the rank correlation of targets with 0 bulges (Pearson’s correlation: 0.57, p < 1e-10; Spearman’s correlation: 0.55, p < 1e-10), the center plot shows the rank correlation of targets with 1 bulge (Pearson’s correlation: −0.16, p < 1e-10; Spearman’s correlation: −0.33, p < 1e-10), and the right plot shows the rank correlation of targets with 2 bulges (Pearson’s correlation: −0.55, p < 1e-10; Spearman’s correlation: −0.80, p < 1e-10). The correlation values and two-sided p-values were calculated using standard functions from the Python scipy library. The colors represent the lowest count of bulges for each target, since the two scoring methods may prioritize different alignments and thus different numbers of mismatches and bulges for the same genomic target.
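The rank comparison described in this caption can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the target names and scores are made up, and `rank_positions` is a hypothetical helper. Only `scipy.stats.pearsonr` and `scipy.stats.spearmanr` are real library functions; both report two-sided p-values by default.

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical scores for six targets (illustration only, not data from the paper).
cfd_scores = {"t1": 0.95, "t2": 0.80, "t3": 0.60, "t4": 0.40, "t5": 0.20, "t6": 0.10}
crista_scores = {"t1": 0.90, "t2": 0.50, "t3": 0.70, "t4": 0.30, "t5": 0.35, "t6": 0.05}


def rank_positions(scores):
    """Map each target to its rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {target: rank for rank, target in enumerate(ordered, start=1)}


cfd_rank = rank_positions(cfd_scores)
crista_rank = rank_positions(crista_scores)

# Compare only targets that both scoring methods report.
shared_targets = sorted(set(cfd_rank) & set(crista_rank))
x = [cfd_rank[t] for t in shared_targets]
y = [crista_rank[t] for t in shared_targets]

pearson_r, pearson_p = pearsonr(x, y)       # two-sided p-value by default
spearman_rho, spearman_p = spearmanr(x, y)  # two-sided p-value by default

print(f"Pearson r = {pearson_r:.2f} (p = {pearson_p:.3g})")
print(f"Spearman rho = {spearman_rho:.2f} (p = {spearman_p:.3g})")
```

Because the inputs in this sketch are tie-free ranks, the two coefficients coincide; on raw scores they generally differ.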
Extended Data Figure 3. HGDP super-population distribution plots.
Cumulative distribution plot of HGDP variant off-targets with CFD≥0.2 and increase in CFD of ≥0.1 per super-population. Individual samples from each of the seven super-populations were shuffled 100 times to calculate the mean and 95% confidence interval (shading around lines). First panel shows distribution within all 54 discrete populations, colored by super-population. Additional seven panels show distribution of discrete populations within each listed super-population.
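One way to build such a shuffled cumulative curve is sketched below. The sample IDs and site sets are hypothetical, all function names are illustrative, and the normal-approximation confidence interval is an assumption, since the caption does not state how the interval was computed.

```python
import random
import statistics

# Hypothetical input: sample ID -> set of qualifying variant off-target IDs.
sample_sites = {
    "S1": {"a", "b"},
    "S2": {"b", "c"},
    "S3": {"c", "d", "e"},
    "S4": {"a"},
}


def cumulative_unique(order, sites_by_sample):
    """Count distinct off-targets seen as samples are added in the given order."""
    seen = set()
    curve = []
    for sample in order:
        seen |= sites_by_sample[sample]
        curve.append(len(seen))
    return curve


def shuffled_curves(sites_by_sample, n_shuffles=100, seed=0):
    """Repeat the cumulative count over random sample orderings."""
    rng = random.Random(seed)
    samples = sorted(sites_by_sample)
    curves = []
    for _ in range(n_shuffles):
        order = samples[:]
        rng.shuffle(order)
        curves.append(cumulative_unique(order, sites_by_sample))
    return curves


def mean_and_ci(curves, z=1.96):
    """Per-position mean with a normal-approximation 95% CI (an assumption)."""
    summary = []
    for values in zip(*curves):
        mean = statistics.fmean(values)
        sem = statistics.stdev(values) / len(values) ** 0.5
        summary.append((mean, mean - z * sem, mean + z * sem))
    return summary


curves = shuffled_curves(sample_sites)
summary = mean_and_ci(curves)
for n_samples, (mean, low, high) in enumerate(summary, start=1):
    print(f"{n_samples} samples: mean={mean:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

The last point of every curve equals the total number of distinct sites, so the interval collapses to zero width there; the spread is widest when only a few samples have been added.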
Extended Data Figure 4. Candidate transcript off-targets introduced by common genetic variants for non-CRISPR sequence-based RNA-targeting therapeutic strategies.
a) A common SNP (in blue) introduces a candidate CDS off-target site with 2 mismatches for the FDA-approved antisense oligo Nusinersen. b) Top 1000 candidate transcript off-targets ranked by mismatches and bulges for Nusinersen from a search performed with the 1000G and HGDP genetic variant datasets. c) A common insertion variant (in red) introduces a candidate 3’UTR off-target site with 4 mismatches + bulges for the FDA-approved RNAi therapy Inclisiran. d) Top 1000 candidate transcript off-targets ranked by mismatches and bulges for Inclisiran from a search performed with the 1000G and HGDP genetic variant datasets.
Figure 1. CRISPRme provides web-based analysis of CRISPR-Cas gene editing off-target potential reflecting population genetic diversity.
a) CRISPRme software takes as input a reference genome, genetic variants, PAM sequence, Cas protein type, spacer sequence, homology threshold and genomic annotations and provides comprehensive, target-focused and individual-focused analyses of off-target potential. It is available as an online webtool and can be deployed locally or used offline as command-line software. b) Analysis of the BCL11A-1617 spacer targeting the +58 erythroid enhancer with SpCas9, NNN PAM, 1000G variants, up to 6 mismatches and up to 2 bulges. c) Top 1000 predicted off-target sites ranked by CFD score, indicating the CFD score of the reference and alternative allele if applicable, with allele frequency indicated by circle size. d) The off-target site with the highest CFD score is created by the minor allele of rs114518452. Coordinates are for hg38 and 0-start for the potential off-target and 1-start for the variant-ID. MAF is based on 1000G. e) The top predicted off-target site from CRISPRme is an allele-specific off-target with 3 mismatches to the BCL11A-1617 spacer sequence, where the rs114518452-C minor allele produces a de novo NGG PAM sequence. PAM sequence shown in bold and mismatches to BCL11A-1617 shown as lowercase. Coordinates are for hg38 and 1-start. f) rs114518452 allele frequencies based on gnomAD v3.1. Coordinates are for hg38 and 1-start. Spacer shown as DNA sequence for ease of visual alignment.
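Panel (e) describes a variant that creates a de novo NGG PAM, with mismatches to the spacer shown in lowercase. The sketch below illustrates both ideas in a simplified form. It is not CRISPRme's implementation: the function names are hypothetical, the sequences are placeholders rather than the real spacer or off-target site, and only a subset of IUPAC codes is included.

```python
# Subset of IUPAC nucleotide codes, enough for common PAM patterns.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def matches_pam(candidate, pattern="NGG"):
    """True if `candidate` satisfies the IUPAC `pattern` position by position."""
    if len(candidate) != len(pattern):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(candidate.upper(), pattern.upper())
    )


def creates_pam(ref_pam, alt_pam, pattern="NGG"):
    """True if the alternative allele matches the PAM and the reference does not."""
    return not matches_pam(ref_pam, pattern) and matches_pam(alt_pam, pattern)


def mark_mismatches(spacer, site):
    """Render `site` with bases that differ from `spacer` in lowercase."""
    return "".join(
        s.upper() if s.upper() == g.upper() else s.lower()
        for g, s in zip(spacer, site)
    )


# Placeholder sequences (hypothetical, for illustration only).
print(creates_pam("CAG", "CGG"))
print(mark_mismatches("ACGTACGT", "ACCTACGA"))
```

A real search would also need to handle both strands and bulges, which this sketch leaves out.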
Figure 2. CRISPRme provides analysis of off-target potential of CRISPR-Cas gene editing reflecting population and private genetic diversity.
a) CRISPRme analysis was conducted with variants from HGDP comprising whole genome sequencing of 929 individuals from 54 diverse human populations. HGDP variant off-targets with greater CFD scores than the reference genome or 1000G were plotted and sorted by CFD score, with HGDP variant off-targets shown in blue and reference or 1000G variant off-targets shown in red. b) Cumulative distribution plot of HGDP variant off-targets with CFD≥0.2 and increase in CFD of ≥0.1 per super-population. Individual samples from each of the seven super-populations were shuffled 100 times to calculate the mean and 95% confidence interval (shading around lines). c) Intersection analysis of HGDP variant off-targets with CFD≥0.2 and increase in CFD of ≥0.1. Shared variants (orange) were found in 2 or more HGDP samples while private variants (green) were limited to a single sample. d) CRISPRme analysis of a single individual (HGDP01211) showing the top 100 variant off-targets from each of the following three categories: shared with 1000G variant off-targets (left panel), higher CFD score compared to reference genome and 1000G but shared with other HGDP individuals (center panel), and higher CFD score compared to reference genome and 1000G with variant not found in other HGDP individuals (right panel). For the center and right panels, reference refers to CFD score from reference genome or 1000G variants. e) The top predicted private off-target site from HGDP01211 is an allele-specific off-target where the rs1191022522-G minor allele produces a canonical NGG PAM sequence in place of a noncanonical NAG PAM sequence. Spacer shown as DNA sequence for ease of visual alignment.
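The filter and the shared/private split used in panels (b) and (c) can be expressed compactly. The following is a minimal sketch under assumed data structures: `VariantOffTarget`, its field names, and the example records are hypothetical and do not reflect CRISPRme's output format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantOffTarget:
    site_id: str     # hypothetical identifier for a candidate off-target site
    sample_id: str   # individual carrying the variant
    cfd_alt: float   # CFD score with the alternative allele
    cfd_ref: float   # CFD score of the reference (or 1000G) comparison


def passes_filter(hit, min_cfd=0.2, min_gain=0.1):
    """Keep hits with CFD >= 0.2 and a CFD increase of >= 0.1."""
    return hit.cfd_alt >= min_cfd and hit.cfd_alt - hit.cfd_ref >= min_gain


def classify_sites(hits):
    """Label each retained site as shared (2+ samples) or private (1 sample)."""
    samples_by_site = defaultdict(set)
    for hit in hits:
        if passes_filter(hit):
            samples_by_site[hit.site_id].add(hit.sample_id)
    return {
        site: "shared" if len(samples) >= 2 else "private"
        for site, samples in samples_by_site.items()
    }


# Hypothetical records (illustration only).
hits = [
    VariantOffTarget("s1", "A", 0.50, 0.10),
    VariantOffTarget("s1", "B", 0.50, 0.10),
    VariantOffTarget("s2", "A", 0.25, 0.00),
    VariantOffTarget("s3", "A", 0.15, 0.00),  # fails the CFD threshold
    VariantOffTarget("s4", "C", 0.90, 0.85),  # fails the CFD gain threshold
    VariantOffTarget("s4", "D", 0.90, 0.85),
]
print(classify_sites(hits))
```

Scores that sit exactly on a threshold are subject to floating-point rounding in the subtraction, so a production version might compare with a small tolerance.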
Figure 3. Allele-specific off-target editing by a BCL11A enhancer targeting gRNA in clinical trials associated with a common variant in African-ancestry populations.
a) Human CD34+ HSPCs from a donor heterozygous for rs114518452-G/C (Donor 1, REF/ALT) were subject to 3xNLS-SpCas9:sg1617 RNP electroporation followed by amplicon sequencing of the off-target site around chr2:210,530,659–210,530,681 (off-target-rs114518452 in 1-start hg38 coordinates). CFD scores for the reference and alternative alleles are indicated and representative aligned reads are shown. Spacer shown as DNA sequence for ease of visual alignment, with mismatches indicated by lowercase and the rs114518452 position shown in bold. b) Reads classified based on allele (indeterminate if the rs114518452 position is deleted) and presence or absence of indels (edits). c) Human CD34+ HSPCs from a donor heterozygous for rs114518452-G/C (Donor 1) were subject to 3xNLS-SpCas9:sg1617 RNP electroporation, HiFi-3xNLS-SpCas9:sg1617 RNP electroporation, or no electroporation (mock) followed by amplicon sequencing of the on-target and off-target-rs114518452 sites. Each dot represents an independent biological replicate (n = 3), lines represent medians. Indel frequency was quantified for reads aligning to either the reference or alternative allele. d) Human CD34+ HSPCs from 6 donors homozygous for rs114518452-G/G (Donors 2–7, REF/REF) were subject to 3xNLS-SpCas9:sg1617 RNP electroporation with 1 biological replicate per donor followed by amplicon sequencing of the on-target and off-target-rs114518452 sites.
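The captions mix 1-start and 0-start coordinates, so a small conversion helper can prevent off-by-one errors. The sketch below assumes that 1-start intervals are inclusive and that 0-start intervals are half-open (the BED convention); the captions do not spell this out, and the function names are illustrative.

```python
import re


def parse_region(region):
    """Parse 'chrom:start-end' (commas and en dashes tolerated)."""
    match = re.fullmatch(r"(\w+):([\d,]+)[-\u2013]([\d,]+)", region.strip())
    if match is None:
        raise ValueError(f"Unrecognized region: {region!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def one_start_to_zero_start(start, end):
    """1-start inclusive -> 0-start half-open (assumed BED-style convention)."""
    return start - 1, end


def zero_start_to_one_start(start, end):
    """0-start half-open -> 1-start inclusive."""
    return start + 1, end


chrom, start1, end1 = parse_region("chr2:210,530,659\u2013210,530,681")
start0, end0 = one_start_to_zero_start(start1, end1)
print(chrom, start0, end0, "length:", end0 - start0)
```

The interval in panel (a) spans 23 positions, which is consistent with a typical 20-nt SpCas9 spacer plus a 3-nt PAM.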
Figure 4. Allele-specific pericentric inversion following BCL11A enhancer editing due to off-target cleavage.
a) Concurrent cleavage of the on-target and off-target-rs114518452 sites could lead to pericentric inversion of chr2 as depicted. PCR primers F1, R1, F2, and R2 were designed to detect potential inversions. b) Human CD34+ HSPCs from a donor heterozygous for rs114518452-G/C (Donor 1) were subject to 3xNLS-SpCas9:sg1617 RNP electroporation, HiFi-3xNLS-SpCas9:sg1617 RNP electroporation, or no electroporation with 3 biological replicates. Human CD34+ HSPCs from 6 donors homozygous for rs114518452-G/G (Donors 2–7, REF/REF) were subject to 3xNLS-SpCas9:sg1617 RNP electroporation with 1 biological replicate per donor. Gel electrophoresis for inversion PCR was performed with F1/F2 and R1/R2 primer pairs on left and right respectively with expected sizes of precise inversion PCR products indicated. c) Reads from amplicon sequencing of the F1/F2 product (expected to include the rs114518452 position) from 3xNLS-SpCas9:sg1617 RNP treatment were aligned to reference and alternative inversion templates. The rs114518452 position is shown in bold. d) Reads classified based on allele (indeterminate if the rs114518452 position is deleted). e) Inversion frequency by ddPCR from same samples as in (b) with three replicates from the single REF/ALT donor and one replicate each from the six REF/REF donors. F/F indicates forward and R/R reverse inversion junctions as depicted in (a).
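Panel (d) assigns each read to an allele and calls it indeterminate when the variant position is deleted. A minimal sketch of that kind of classification follows. It assumes each read is already available as a gapped pairwise alignment against the template ('-' in the template marks an insertion, '-' in the read marks a deletion). The function names, the template, and the reads are hypothetical, and this is not the pipeline used in the paper.

```python
from collections import Counter


def classify_read(aligned_ref, aligned_read, snp_offset, ref_base, alt_base):
    """Return (allele, edited) for one gapped pairwise alignment.

    `snp_offset` is the 0-based position of the variant in the ungapped template.
    Any gap counts as an edit here, which is a simplification.
    """
    edited = "-" in aligned_ref or "-" in aligned_read
    allele = "indeterminate"
    ref_pos = -1
    for ref_char, read_char in zip(aligned_ref, aligned_read):
        if ref_char == "-":
            continue  # insertion in the read: no template position consumed
        ref_pos += 1
        if ref_pos == snp_offset:
            if read_char == ref_base:
                allele = "REF"
            elif read_char == alt_base:
                allele = "ALT"
            break  # a deleted or unexpected base stays indeterminate
    return allele, edited


def summarize_reads(alignments, snp_offset, ref_base, alt_base):
    """Per allele: (number of reads, fraction of reads carrying an indel)."""
    totals, edited_counts = Counter(), Counter()
    for aligned_ref, aligned_read in alignments:
        allele, edited = classify_read(
            aligned_ref, aligned_read, snp_offset, ref_base, alt_base
        )
        totals[allele] += 1
        edited_counts[allele] += int(edited)
    return {a: (totals[a], edited_counts[a] / totals[a]) for a in totals}


# Hypothetical template "ACTGTCA" with the variant (G/C) at offset 3.
alignments = [
    ("ACTGTCA", "ACTGTCA"),    # REF, unedited
    ("ACTGTCA", "ACTCTCA"),    # ALT, unedited
    ("ACTGTCA", "ACTC-CA"),    # ALT, 1-bp deletion
    ("ACTGTCA", "AC--TCA"),    # variant position deleted
    ("AC-TGTCA", "ACATCTCA"),  # ALT, 1-bp insertion
]
print(summarize_reads(alignments, snp_offset=3, ref_base="G", alt_base="C"))
```

The same tally gives the per-allele indel frequency when applied to amplicon reads from an off-target site rather than an inversion junction.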
Figure 5. CRISPRme illustrates prevalent off-target potential due to genetic variation.
a) Heatmap showing the distribution of alternative allele nominated off-targets for SpCas9 guides by CFD score and MAF. b) UpSet plot showing overlapping annotation categories for candidate off-targets (TSG, tumor suppressor gene; candidate off-targets on the same chromosome as the on-target; CDS regions; cCRE from ENCODE; and PAM creation events). c) Top 100 predicted off-target sites ranked by CFD score for the gRNA targeting PCSK9 (with no filter, found in cCREs, corresponding to PAM creation events, and in CDS regions). d) Top left: Candidate off-target sites with increased predicted cleavage potential introduced by common (MAF 52% and 26%) indel variants for a SpCas9 gRNA targeting EMX1. Right: Candidate off-target cleavage sites within coding sequences with increased homology to a lead gRNA for SaCas9 targeting of CEP290 to treat congenital blindness in current clinical trials due to common SNPs. Bottom: Potential missense mutations in the EPHB3 tumor suppressor resulting from candidate off-target A-to-G base editing by a preclinical lead gRNA targeting PCSK9 to reduce LDL cholesterol levels. Deletions shown in red, SNPs shown in blue.

Methods References

    1. Corces MR et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).
