Nat Commun. 2025 May 21;16(1):4717.
doi: 10.1038/s41467-025-59947-0.

Robust prediction of synthetic gRNA activity and cryptic DNA repair by disentangling cellular CRISPR cleavage outcomes

Stephan Riesenberg et al. Nat Commun. 2025.

Abstract

The ability to robustly predict guide RNA (gRNA) activity is a long-standing goal for CRISPR applications, as it would reduce the need to pre-screen gRNAs. Quantification of formation of short insertions and deletions (indels) after DNA cleavage by transcribed gRNAs has been typically used to measure and predict gRNA activity. We evaluate the effect of chemically synthesized Cas9 gRNAs on different cellular DNA cleavage outcomes and find that the activity of different gRNAs is largely similar and often underestimated when only indels are scored. We provide a simple linear model that reliably predicts synthetic gRNA activity across cell lines, robustly identifies inefficient gRNAs across different published datasets, and is easily accessible via online genome browser tracks. In addition, we develop a homology-directed repair efficiency prediction tool and show that unintended large-scale repair events are common for Cas9 but not for Cas12a, which may be relevant for safety in gene therapy applications.

Conflict of interest statement

Competing interests: Related patent applications on DNA-PKcs inhibitors for increasing genome editing efficiency (patent applicant: Max Planck Society; inventors: S.R. and T.M.; application number: EP18215071.4) and GOLD-gRNA (patent applicant: Max Planck Society; inventors: S.R., N.H., and T.M.; application number: EP21176366.9) have been filed. The remaining authors declare no competing interests.

Figures

Fig. 1
Fig. 1. Cellular synthetic gRNA CRISPR-Cas9 cleavage outcome screen.
a Schematic of potential cellular cleavage outcomes after a CRISPR-induced DSB. Cells can die as a consequence of the DSB or repair it by competing pathways (NHEJ, MMEJ, HR/HDR), which can result in perfect repair to the wild type sequence, insertions or deletions (indels), or targeted substitutions when a DNA donor is provided. b Chemically synthesized (synthetic) gRNA screen design. 409B2 hiPSCs expressing Cas9 were lipofected with synthetic oligonucleotides in one well of a 96-well plate per target (n = 78 targets). Cell survival was measured by a resazurin fluorescence assay, followed by DNA isolation and Illumina sequencing of target PCR amplicons. c Scatter plots of replicates for percentage of cell death in purple, indels in blue, substitutions in green, editing (sum of indels and substitutions) in gold, and in vivo gRNA activity (quantification of cell death and editing) in red, when editing with gRNA and DNA donor. Darker and brighter dots indicate the correlation of replicate one with respect to replicate two and three, respectively. Pearson’s r for correlation (two-tailed) of independent biological replicates (n = 3) is stated (black frame for editing without DNA donor). All p values are <0.0001. d Histograms showing the density of the distribution of gRNAs that result in different mean (n = 3) percentages of cell death, indels, substitutions, editing, and gRNA activity, when editing with gRNA and DNA donor. For cell death and indels, gray bars represent the distribution for editing without DNA donor. Lines show the respective Lorentzian distributions (black for editing without DNA donor, colored for editing with DNA donor). Source data are provided as a Source Data file.
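The replicate agreement reported in panel c is expressed as Pearson’s r. As a minimal sketch of that statistic, the function below computes r for two replicates; the replicate values are hypothetical and are not taken from the study's Source Data file.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical indel percentages for five gRNAs in two replicates.
replicate_1 = [10.0, 35.0, 52.0, 70.0, 88.0]
replicate_2 = [12.0, 33.0, 55.0, 68.0, 90.0]
r = pearson_r(replicate_1, replicate_2)
```

A value of r close to 1 means the two replicates rank and scale the gRNAs almost identically.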
Fig. 2
Fig. 2. Features influencing synthetic gRNA activity.
a Scatter plots of indels (blue) and in vivo gRNA activity (red) of our synthetic gRNA screen and scores from different gRNA efficiency prediction tools trained on synthetic gRNAs (undisclosed IDT score), in vivo U6 promoter-transcribed gRNAs, and in vitro T7 promoter-transcribed gRNAs. Pearson’s r (two-tailed) is stated (*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001). Each dot represents the mean of independent biological replicates (n = 3) for one gRNA. b Heatmaps of mean gRNA efficiency metrics for all possible PAM-proximal dinucleotide combinations (positions 19 and 20 of the gRNA). The left panel shows that efficiency from a meta-analysis over 70,000 in vivo transcribed gRNAs is highly influenced by PAM-proximal nucleotide composition. In contrast, synthetic gRNAs in our screen and other studies show limited influence of PAM-proximal nucleotides. c Flowchart of gRNA feature selection and identification of predictive features for synthetic in vivo gRNA activity. d Scatter plots of in vivo gRNA activity and free energy of the 20nt gRNA sequence, MIT off-target score, and number of ‘GA’ dinucleotides. Pearson’s r of the correlation (two-tailed) is stated (**p ≤ 0.01, ***p ≤ 0.001). Each dot represents the mean of independent biological replicates (n = 3) for one gRNA. gRNAs with a PAM-proximal GCC motif previously described to be detrimental for cleavage are depicted as white-filled circles in the free energy panel. Source data are provided as a Source Data file.
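Two of the sequence features named in this caption can be read directly from the 20nt gRNA sequence: the number of ‘GA’ dinucleotides (panel d) and the PAM-proximal dinucleotide at positions 19 and 20 (panel b). The sketch below shows one way to extract them. The function names and the example sequence are hypothetical, and this is not the authors' feature-extraction code.

```python
def _validated(spacer: str) -> str:
    """Upper-case the spacer and check that it is a 20nt DNA sequence."""
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("expected a 20nt sequence of A, C, G, T")
    return spacer


def ga_dinucleotide_count(spacer: str) -> int:
    """Number of 'GA' dinucleotides in the 20nt gRNA sequence."""
    # 'GA' cannot overlap with itself, so str.count finds every occurrence.
    return _validated(spacer).count("GA")


def pam_proximal_dinucleotide(spacer: str) -> str:
    """Nucleotides at positions 19 and 20 (1-based) of the gRNA sequence."""
    return _validated(spacer)[18:20]


# Hypothetical 20nt sequence, for illustration only.
example = "GACGATTGAACCTGAGGTCA"
ga_count = ga_dinucleotide_count(example)
dinucleotide = pam_proximal_dinucleotide(example)
```

For the example sequence, `ga_count` is 4 and `dinucleotide` is `"CA"`. Free energy and the MIT off-target score need external tools and are not covered here.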
Fig. 3
Fig. 3. Prediction of synthetic gRNA activity.
a Impact of detrimental gRNA features contributing to cumulative penalty on in vivo gRNA activity that constitutes the EVA activity score. The extent of penalty for the worst gRNA in the screen (GPC6) is shown. The shaded portion of the respective bars shows the individual feature penalty. b Scatter plot of the EVA activity score and measured in vivo gRNA activity from 409B2 iCRISPR hiPSCs used for training (n = 78). Each dot represents the mean of independent biological replicates (n = 3) for one gRNA. Pearson’s r of the correlation (two-tailed) is stated (**p ≤ 0.01, ***p ≤ 0.001). c EVA score and measured in vivo gRNA activity for RNP based editing (n = 9) in 409B2 hiPSCs, HEK293, and eHAP. Each dot represents the mean of independent biological replicates (n = 2) for one gRNA. d EVA score prediction performance for new gRNAs (n = 40) not used for training of the EVA score in 409B2 iCRISPR hiPSCs. Each dot represents the mean of independent biological replicates (n = 3) for one gRNA. e Screenshot of the UCSC Genome Browser with a track for precalculated EVA scores that state the score and are color coded in green for efficient and orange for inefficient synthetic gRNAs. Below are gRNA tracks with Rule Set 3 scores for transcribed gRNAs with the Hsu or Chen tracrRNA, colored in percentile bins based on the already available Doench 2016 score CRISPR track. f Percentage of predicted efficient synthetic gRNAs (EVA score bins ≥ 50) and inefficient synthetic gRNAs (EVA score < 50) for the human genome. g The left pie chart shows the percentage of human genome wide ClinVar sites with at least one, zero due to missing proximal NGG Cas9 PAM, or no efficient proximal transcribed gRNA based on the Doench score. The right pie chart depicts the portion of ClinVar sites of the latter condition that are predicted to be targetable when a synthetic gRNA is used instead. Source data are provided as a Source Data file.
Fig. 4
Fig. 4. EVA score prediction performance for published datasets.
a Scatter plots of gRNA efficiency metrics from published studies and their respective EVA scores. Efficiency metrics are indels (blue), normalized GFP disruption (green), and normalized phenotype depletion (beige). Pearson’s r (two-tailed) (***p ≤ 0.001), cell type, and number of gRNAs in the datasets are stated. The four gRNAs with the lowest measured efficiency in the Doench et al. (2016) dataset are labeled orange (D1–D4), and the four gRNAs with the highest measured efficiency are labeled blue (D5–D8). b An EVA score higher than the measured efficiency (green arrow area) is compatible with missed cellular cleavage outcomes and/or gRNA transcription bias in the original studies, while EVA scores lower than the measured efficiency (gray arrow area) would tend to indicate false negative scoring by our prediction or positive selection of mutation. c In vivo gRNA activity of D1–D8 for editing using synthetic gRNA RNPs in 409B2 hiPSCs. Doench percentile scores (trained on the 2016 dataset including D1–D8) and EVA scores are stated and color coded for predicted gRNA efficiency. Independent biological replicates were performed (n = 2). d Same as c but in HEK293 cells. e Performance evaluation of EVA score prediction of bad gRNAs (EVA score < 50) in published datasets from (a). gRNAs are binned by quartiles of measured efficiencies. f Correctly predicted bad gRNAs (EVA score < 50) that belong to the worst 25% (Q1) of gRNAs of published datasets (see also Fig. 4e). The red portion of the pie chart describes the amount of those inefficient gRNAs, which would have also been predicted as inefficient by either the Hsu or the Chen tracrRNA Rule Set 3 score (<Q1), and the white portion describes inefficient gRNAs that would have only been predicted as inefficient by the EVA score. Source data are provided as a Source Data file.
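Panels e and f evaluate the EVA score by binning gRNAs into quartiles of measured efficiency and asking how many gRNAs in each bin are flagged as bad (EVA score < 50). The sketch below illustrates that kind of evaluation. It assumes quartile cut points from Python's `statistics.quantiles`, which may differ from the binning used in the paper, and all input values are hypothetical.

```python
from statistics import quantiles


def flagged_fraction_by_quartile(measured, eva_scores, threshold=50.0):
    """Fraction of gRNAs with EVA score < threshold in each efficiency quartile."""
    if len(measured) != len(eva_scores):
        raise ValueError("inputs must have the same length")
    q1, q2, q3 = quantiles(measured, n=4)  # quartile cut points
    bins = {"Q1": [], "Q2": [], "Q3": [], "Q4": []}
    for efficiency, score in zip(measured, eva_scores):
        if efficiency <= q1:
            label = "Q1"
        elif efficiency <= q2:
            label = "Q2"
        elif efficiency <= q3:
            label = "Q3"
        else:
            label = "Q4"
        bins[label].append(score < threshold)
    # An empty bin reports 0.0 instead of dividing by zero.
    return {k: (sum(v) / len(v) if v else 0.0) for k, v in bins.items()}


# Hypothetical measured efficiencies (%) and EVA scores for eight gRNAs.
measured = [5, 12, 30, 45, 60, 72, 85, 95]
eva = [30, 45, 55, 48, 70, 80, 90, 85]
fractions = flagged_fraction_by_quartile(measured, eva)
```

A useful predictor of bad gRNAs concentrates its flags in Q1, as in this toy example, where both Q1 gRNAs are flagged and none in Q3 or Q4.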
Fig. 5
Fig. 5. Comparative screening with NHEJ inhibition to infer cryptic repair events.
a Schematic of potential cellular cleavage outcomes after a CRISPR-induced DSB, when NHEJ is inhibited by M3814. b Scatter plot of the cleavage outcomes cell death, indels, and substitutions for editing with and without M3814. Each dot represents the mean of independent biological replicates (n = 3) for one gRNA (n = 78 targets). c Conventional indel metric and in vivo gRNA activity with and without M3814. d Scatter plot of in vivo gRNA activity with and without M3814. Because the gRNA activity is independent of repair pathway inhibition by M3814, deviation from a perfect correlation indicates either a tendency for perfect repair (turquoise dots), or large-scale repair (pink dots), both events that are cryptic and overlooked without comparative screening. The gray band indicates comparable activities (±5%) with or without M3814 treatment. e Resazurin assay, Illumina sequencing, and comparative screening with and without DNA donor or M3814 for the targets SSH2 and NOVA1. The dotted lines between the resazurin assay panel and the Illumina sequencing panel indicate that only DNA extracted from surviving cells will be sequenced. The dotted line in the panel of measured outcomes marks the maximum detected cleavage efficiency that is used to infer the amount of perfect repair and large-scale repair. Independent biological replicates were performed (n = 3) and error bars show the s.e.m. f Scatter plots of indels and inferred perfect repair (left panel) or large-scale repair (right panel). Pearson’s r (two-tailed) is stated (***p ≤ 0.001). Each dot represents the mean of independent biological replicates (n = 3) for one gRNA (n = 78 targets). g Scatter plot of inDelphi microhomology score and large-scale repair. h Scatter plot of inferred large-scale repair from two neighboring gRNAs (65 bp maximum cut distance). Twenty-five gRNAs have a neighboring tested gRNA in our screen. i Scatter plot of in vivo Cas12a gRNA activity with and without M3814. gRNAs with tendency for perfect repair are shown as turquoise dots. Each dot represents the mean of independent biological replicates (n = 2) for one Cas12a gRNA (n = 20 targets). Source data are provided as a Source Data file.
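The comparative screening in panels d and e rests on two simple operations: checking whether activity with and without M3814 falls inside the ±5% band, and measuring how far each condition falls short of the maximum detected cleavage efficiency. The sketch below illustrates both under stated assumptions. It treats the band as percentage points, it reports only the direction of a deviation without assigning it to perfect or large-scale repair (the caption does not state that mapping), and the condition names and values are hypothetical.

```python
def classify_deviation(activity_without, activity_with_m3814, band=5.0):
    """Compare in vivo gRNA activity (%) with and without M3814.

    The band is assumed to be in percentage points.
    """
    diff = activity_with_m3814 - activity_without
    if abs(diff) <= band:
        return "comparable"
    return "higher with M3814" if diff > 0 else "lower with M3814"


def shortfall_from_maximum(measured_by_condition):
    """Gap between each condition and the maximum detected efficiency (%)."""
    ceiling = max(measured_by_condition.values())
    return {cond: ceiling - value for cond, value in measured_by_condition.items()}


# Hypothetical measured outcomes (%) for one target under three conditions.
outcomes = {"gRNA only": 70.0, "gRNA + donor": 78.0, "gRNA + M3814": 92.0}
label = classify_deviation(outcomes["gRNA only"], outcomes["gRNA + M3814"])
gaps = shortfall_from_maximum(outcomes)
```

Here the target deviates from the band (`"higher with M3814"`), and the gap of each condition to the maximum is the share of cleavage events that went undetected in that condition.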
Fig. 6
Fig. 6. Diverting perfect repair increases editing efficiency.
a Schematic of tested combinations of no enzyme, wild type Cas9 (wt), dead Cas9 (dCas9), Cas9-D10A, and Cas9-H840A for editing of the SSH2 locus using two neighboring PAM-out gRNAs. The triangles indicate the active nuclease domains. b Illumina sequencing, measured outcomes (combined cell death and sequencing), and inferred perfect repair for RNP-based editing in 409B2 hiPSCs for combinations from (a). Independent biological replicates were performed (n = 2). Due to the lower cleavage efficiency of the H840A variant, outcomes were not inferred for combinations with H840A. c Deletion frequency percentage and deletion shapes from Illumina sequencing for combinations from (a). Dotted lines indicate the positions of gRNA cleavage sites. Vertical flanks of deletion shapes at cleavage sites are shown by arrows. d Genome editing efficiencies for four targets with a single gRNA (wild type Cas9 RNP), and with an additional neighboring nicking gRNA (Cas9-D10A RNP). Fold-changes of editing efficiency are stated. Independent biological replicates were performed (n = 3) and error bars show the s.e.m. e Deletion frequency percentage and shapes from Illumina sequencing from (d). The blue lines indicate the deletion shape of editing using a single wild type Cas9 gRNA, and the blue dotted lines indicate the cleavage sites of said gRNAs. The brown lines indicate the deletion shape with an additional neighboring nicking gRNA, and the black dotted line indicates the additional nicking site. f Scatter dot plot of indels from (d) with additional nicking and in vivo gRNA efficiency determined in the initial synthetic gRNA screen. Pearson’s r (two-tailed) is stated (**p ≤ 0.01). g Genome editing efficiencies for four targets with a Cas9-HiFi RNP in 409B2 hiPSCs with or without M3814 and normal gRNA backbone or GOLD-gRNA, as well as combination of M3814 and GOLD-gRNA. Fold-changes of editing efficiency are stated. Independent biological replicates were performed (n = 2). Source data are provided as a Source Data file.

References

    1. Haapaniemi, E., Botla, S., Persson, J., Schmierer, B. & Taipale, J. CRISPR-Cas9 genome editing induces a p53-mediated DNA damage response. Nat. Med. 24, 927–930 (2018).
    2. Ihry, R. J. et al. p53 inhibits CRISPR-Cas9 engineering in human pluripotent stem cells. Nat. Med. 24, 939–946 (2018).
    3. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013).
    4. Riesenberg, S. et al. Simultaneous precise editing of multiple genes in human cells. Nucleic Acids Res. 47, e116 (2019).
    5. Bennett, E. P. et al. INDEL detection, the ‘Achilles heel’ of precise genome editing: a survey of methods for accurate profiling of gene editing induced indels. Nucleic Acids Res. 48, 11958–11981 (2020).
