Nucleic Acids Res. 2018 Feb 16;46(3):1375-1385.
doi: 10.1093/nar/gkx1268.

Refined sgRNA efficacy prediction improves large- and small-scale CRISPR-Cas9 applications

Maurice Labuhn et al. Nucleic Acids Res. 2018.

Abstract

Genome editing with the CRISPR-Cas9 system has enabled unprecedented efficacy for reverse genetics and gene correction approaches. While off-target effects have been successfully tackled, the effort to eliminate variability in sgRNA efficacies, which affect experimental sensitivity, is in its infancy. To address this issue, studies have analyzed the molecular features of highly active sgRNAs, but independent cross-validation is lacking. Utilizing fluorescent reporter knock-out assays with verification at selected endogenous loci, we experimentally quantified the target efficacies of 430 sgRNAs. Based on this dataset we tested the predictive value of five recently-established prediction algorithms. Our analysis revealed a moderate correlation (r = 0.04 to r = 0.20) between the predicted and measured activity of the sgRNAs, and modest concordance between the different algorithms. We uncovered a strong PAM-distal GC-content-dependent activity, which enabled the exclusion of inactive sgRNAs. By deriving nine additional predictive features we generated a linear model-based discrete system for the efficient selection (r = 0.4) of effective sgRNAs (CRISPRater). We proved our algorithms' efficacy on small and large external datasets, and provide a versatile combined on- and off-target sgRNA scanning platform. Altogether, our study highlights current issues and efforts in sgRNA efficacy prediction, and provides an easily-applicable discrete system for selecting efficient sgRNAs.

Figures

Figure 1.
Evaluation of CRISPR–Cas9 cleavage utilizing a fluorescent based reporter-assay. (A) Schematic presentation of the reporter construct. CRISPR–Cas9 target sites are integrated into the ORF of a sfGFP (super-folder GFP) cDNA. Puromycin resistance enables selection of HEL cells harboring the reporter construct. Genome editing generates frame-shift mutations leading to quantifiable loss of fluorescence. (B, C) Representative flow cytometry analyses of a reporter assay. X-axis: sfGFP reporter fluorescence; Y-axis: fluorescence of the provided CRISPR–Cas9 components, namely, a non-targeting sgRNA (B) and a targeting sgRNA (C). (D) An example of T7-endonuclease I assay results, and correlation of genomic modification efficacies at endogenous loci with results from the reporter assay. [M] DNA marker; [+] targeting sgRNA; [–] non-targeting sgRNA. (E) Correlation of reporter assay efficacies (x-axis) and T7-endonuclease I assay efficacies (y-axis). Pearson correlation (r) and P value (p) are indicated.
Figure 2.
Experimentally-ascertained cutting efficacies of 430 sgRNAs can be partially predicted by five up-to-date online prediction tools. (A) Distribution of the individually-assessed cleavage efficacies of 430 sgRNAs targeting a total of 92 genes (54 human and 38 murine), calculated on the basis of reporter assay analyses. The dataset shows median cleavage efficacy of 76.1% (interquartile range: 52.9 to 85.8%). (B–F) Scatter plots showing the correlation between sgRNA-specific cleavage efficacies obtained from reporter assays (y-axes) and predicted scores obtained from sgRNA sequence analysis using the online tools sgRNA Designer (rule set I) (22) (B), sgRNA Designer (rule set II) (29) (C), sgRNA Scorer (24) (D), SSC score (27) (E) and CRISPRscan (26) (F) (x-axes). The Pearson correlations (r) and P values (p) are indicated.
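The Pearson correlation reported in panels B–F can be reproduced for any pair of score and efficacy lists. The following is a minimal sketch, not the authors' analysis code; the function name `pearson_r` and the toy numbers in the usage note are illustrative assumptions.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between predicted scores and measured efficacies.

    Hypothetical helper for illustration; both inputs are equal-length
    sequences of numbers (one entry per sgRNA).
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

For example, `pearson_r([0.1, 0.5, 0.9], [20.0, 50.0, 80.0])` is 1 up to floating-point rounding, because the toy values are perfectly linear; real prediction scores would give values closer to the range reported in the abstract.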
Figure 3.
Investigation of sgRNA sequence features capable of increasing genome editing capacity. (A) Efficacies of sgRNAs subdivided based on nucleotide usage at position 20 (adjacent to the PAM). Efficacy increased by 9.2% and 8.8% for G versus T and G versus C, respectively, and both are statistically significant by the Mann–Whitney test. (B) Comparison of sgRNA efficacy based on genomic origin of the target site. The sgRNAs target intronic regions (median 69.0%, interquartile range: 40.5–80.6%, n = 154), exonic regions (median 78.7%, interquartile range: 55.1–87.1%, n = 222) and promoter regions (median 80.7%, interquartile range: 74.8–87.2%, n = 54). Mann-Whitney test results are as indicated. (C–F) Depiction of GC-content dependent sgRNA activity. Overall GC: no significant differences (C). 1.78-fold to 1.87-fold reduction of efficacy in sgRNAs with <25% GC within nt 1–10 (n = 14) (D). GC-content has no effect on activity within nt 11–20 (E). Narrowed-down GC window size (5 nt) of PAM-distal GC-content: 1.36–1.41-fold reduction of efficacy with <25% GC within nt 4–8 (n = 50) (F). *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001; ****P ≤ 0.0001; Mann–Whitney test used in all cases.
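The windowed GC-content check described in panels D and F can be sketched as follows. This assumes a 20-nt guide written 5' to 3' with position 20 adjacent to the PAM, as in the caption; the function names, the default window (nt 4–8) and the example sequences are illustrative assumptions, not part of the published tool.

```python
def gc_fraction(guide, start, end):
    """GC fraction of a 20-nt guide over 1-based inclusive positions start..end.

    Position 1 is the PAM-distal end; position 20 is adjacent to the PAM.
    """
    guide = guide.upper()
    if len(guide) != 20:
        raise ValueError("expected a 20-nt guide sequence")
    if not 1 <= start <= end <= 20:
        raise ValueError("window must satisfy 1 <= start <= end <= 20")
    window = guide[start - 1:end]
    return sum(base in "GC" for base in window) / len(window)


def low_pam_distal_gc(guide, start=4, end=8, threshold=0.25):
    """True if GC content in the window is below the threshold (<25% by default)."""
    return gc_fraction(guide, start, end) < threshold
```

With the 5-nt window, "<25% GC" means at most one G or C among nt 4–8. A made-up guide such as `AAATTTTAGGGCGCGCGCGC` would be flagged, while `GCGCGCGCGCAAAAAAAAAA` would not.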
Figure 4.
CRISPRater: a 10-feature-based algorithm capable of predicting sgRNA activity via a discrete model. (A) Distribution of sgRNA efficacies within the reporter assay dataset. sgRNAs defined as low-efficiency (<40% efficacy, n = 70) and high-efficiency (>70%, n = 265) were segregated. (B) Average GC-content compared between high-efficiency and low-efficiency sgRNAs. The high-efficiency group displayed a higher overall GC-content from nt 4 to nt 13. (C) Most potent sequence features modulating sgRNA efficacy (five positively- and five negatively-modulating) extracted from 1024 features by individual feature weight. (D) Boxplot showing separation of the sgRNA dataset into low-efficiency (score <0.56, n = 52), medium-efficiency (score 0.56–0.74, n = 274) and high-efficiency (score >0.74, n = 100) groups according to discrete CRISPRater modeling, thereby excluding 12.2% of sgRNAs as low-efficiency (****P ≤ 0.0001 by use of Mann–Whitney test). (E) Validation of CRISPRater on 65 subsequently-designed sgRNAs not used to train the algorithm. The scatter plot shows a positive correlation between CRISPRater scores and measured sgRNA efficacies. Pearson correlation (r) and P value (p) are indicated.
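The discrete grouping in panel D can be illustrated with the cutoffs given in the caption (score <0.56, 0.56–0.74, >0.74). This is only a sketch of the thresholding step: the feature weights that produce a CRISPRater score are not given here and are not reproduced, and the names `efficiency_group`, `LOW_CUTOFF` and `HIGH_CUTOFF` are assumptions. The group sizes in the caption sum to 426, and 52 of 426 gives the stated 12.2%.

```python
LOW_CUTOFF = 0.56   # scores below this fall in the low-efficiency group
HIGH_CUTOFF = 0.74  # scores above this fall in the high-efficiency group


def efficiency_group(score):
    """Map a precomputed score to the discrete group used in panel D."""
    if score < LOW_CUTOFF:
        return "low"
    if score > HIGH_CUTOFF:
        return "high"
    return "medium"  # 0.56 to 0.74, both bounds included


# Group sizes as stated in the caption of panel D.
group_sizes = {"low": 52, "medium": 274, "high": 100}
excluded_fraction = group_sizes["low"] / sum(group_sizes.values())
print(f"excluded as low-efficiency: {excluded_fraction:.1%}")
```

Treating the boundary values 0.56 and 0.74 as medium follows the strict inequalities in the caption.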
Figure 5.
Predictivity validation of the CRISPRater score based on external experimentally-assessed cutting efficacies. (A) Validation of CRISPRater on a combined sgRNA dataset (n = 3141 sgRNAs) derived by Xu et al., proving its predictive capacity (****P ≤ 0.0001 by use of Mann–Whitney test). (B and C) Scatter plots showing correlations of CRISPRater predicted cutting efficacies (x-axes) and experimentally-tested efficacies from Xu et al. (27) (y-axes). Positive correlations can be seen with n = 20 efficacies tested on the genomic level via the Surveyor assay (B), and with n = 15 efficacies tested on the protein expression level (C). Pearson correlations (r) and P values (p) are indicated.

References

    1. Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J.A., Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012; 337:816–821.
    2. Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., Hsu P.D., Wu X., Jiang W., Marraffini L.A. et al. Multiplex genome engineering using CRISPR/cas systems. Science. 2013; 339:819–823.
    3. Mali P., Yang L., Esvelt K.M., Aach J., Guell M., DiCarlo J.E., Norville J.E., Church G.M. RNA-guided human genome engineering via Cas9. Science. 2013; 339:823–826.
    4. Fu Y., Foden J.A., Khayter C., Maeder M.L., Reyon D., Joung J.K., Sander J.D. High-frequency off-target mutagenesis induced by CRISPR-cas nucleases in human cells. Nat. Biotechnol. 2013; 31:822–826.
    5. Hsu P.D., Scott D.A., Weinstein J.A., Ran F.A., Konermann S., Agarwala V., Li Y., Fine E.J., Wu X., Shalem O. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 2013; 31:827–832.
