PLoS Comput Biol. 2016 Jan 29;12(1):e1004724.
doi: 10.1371/journal.pcbi.1004724. eCollection 2016 Jan.

A Biophysical Model of CRISPR/Cas9 Activity for Rational Design of Genome Editing and Gene Regulation

Iman Farasat et al. PLoS Comput Biol. 2016.

Abstract

The ability to precisely modify genomes and regulate specific genes will greatly accelerate several medical and engineering applications. The CRISPR/Cas9 (Type II) system binds and cuts DNA using guide RNAs, though the variables that control its on-target and off-target activity remain poorly characterized. Here, we develop and parameterize a system-wide biophysical model of Cas9-based genome editing and gene regulation to predict how changing guide RNA sequences, DNA superhelical densities, Cas9 and crRNA expression levels, organisms and growth conditions, and experimental conditions collectively control the dynamics of dCas9-based binding and Cas9-based cleavage at all DNA sites with both canonical and non-canonical PAMs. We combine statistical thermodynamics and kinetics to model Cas9:crRNA complex formation, diffusion, site selection, reversible R-loop formation, and cleavage, using large amounts of structural, biochemical, expression, and next-generation sequencing data to determine kinetic parameters and develop free energy models. Our results identify DNA supercoiling as a novel mechanism controlling Cas9 binding. Using the model, we predict Cas9 off-target binding frequencies across the λ-phage and human genomes, and explain why Cas9's off-target activity can be so high. With this improved understanding, we propose several rules for designing experiments that minimize off-target activity. We also discuss the implications for engineering dCas9-based genetic circuits.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The multi-step mechanism responsible for Cas9-mediated DNA site cleavage.
(A) Each crRNA strand is expressed with rate rcrRNA. The active crRNA is formed by either hybridization of an expressed tracrRNA with an expressed precrRNA or by direct expression of a single guide RNA (sgRNA). The Cas9 endonuclease is expressed with rate rCas9. (B) Cas9 binds to the mature crRNA with a forward kinetic association constant kf. After loading the crRNA, the structure of the Cas9:crRNA undergoes an isomerization with forward kinetic constant kI to create an active complex. NcrRNA, NCas9, Nintermediate, and NCas9:crRNA are their numbers of molecules. (C) The resulting active complex performs a 3D random walk with molar flow rate rRW. The probability that it binds to a DNA site is determined by the site sequence, including the presence of a protospacer adjacent motif (PAM), the number of same-sequence DNA sites (Ntarget, j), and their binding free energy (ΔGtarget, j). (D) The formation of a stable Cas9:crRNA:DNA complex occurs in several steps: Cas9:crRNA recognizes the PAM site, unwinds the DNA duplex, and sequentially replaces DNA:DNA base pairings with RNA:DNA base pairings in an exchange reaction to form a DNA:RNA:DNA complex, called an R-loop. The DNA target site's binding free energy to Cas9:crRNA (ΔGtarget) sums together its PAM interaction energy (ΔGPAM), the energy needed to unwind the supercoiled DNA (ΔΔGsupercoiling), and the crRNA:DNA exchange energy during R-loop formation (ΔΔGexchange). During these steps, the Cas9:crRNA:DNA complex may dissociate with first-order kinetic constant kd or it may cleave the bound DNA site with pseudo-first-order kinetic constant kC. (E) After cleavage, the Cas9:crRNA:DNA complex remains bound to the cleaved DNA, and is considered a no-turnover enzyme. Additional model parameters include the DNA replication rate (μ) and the degradation or dilution rates of Cas9 (δCas9), crRNA (δcrRNA), and Cas9:crRNA complex (δCas9:crRNA).
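The free energy sum in panel D, and the competition between dissociation (kd) and cleavage (kC), can be illustrated with a minimal Python sketch. The function names and the numeric values are hypothetical. The branching-ratio formula is the standard result for two competing first-order exits from one state; it is a simplification offered here for intuition and is not taken from the paper's full kinetic model.

```python
def target_binding_free_energy(dg_pam, ddg_supercoiling, ddg_exchange):
    """Sum the three contributions named in the caption (all in kcal/mol)."""
    return dg_pam + ddg_supercoiling + ddg_exchange


def cleavage_fraction(k_c, k_d):
    """Fraction of bound complexes that cleave rather than dissociate.

    Assumption: cleavage (k_c) and dissociation (k_d) are the only two exits
    from the bound state and both are first order, so the branching ratio
    is k_c / (k_c + k_d).
    """
    if k_c < 0 or k_d < 0 or (k_c + k_d) == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_c / (k_c + k_d)


# Hypothetical illustrative values, not parameters from the paper.
dg_target = target_binding_free_energy(dg_pam=-9.0,
                                       ddg_supercoiling=1.5,
                                       ddg_exchange=0.78)
fraction_cleaved = cleavage_fraction(k_c=1.0, k_d=3.0)
```

Under these assumptions, a site whose mismatches raise kd relative to kC is cleaved less often per binding event, even if Cas9:crRNA visits it frequently.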
Fig 2. Parameterization of the model using in vitro data.
Equimolar mixtures of Cas9 and crRNA (concentrations shown) were pre-incubated for 10 minutes, followed by the addition of target DNA and measurement of the amount of cleaved DNA. Normalized cleaved DNA measurements (orange circles) using 25 nM negatively supercoiled plasmid DNA are compared to normalized model-calculated amounts of cleaved DNA (lines). Data points represent single measurements from Sternberg et al. [38].
Fig 3. Parameterization of the model using in vivo data.
(A) The addition of target DNA sites with the same sequence sequesters the Cas9:crRNA complex, and increases the transcription rate of the promoter controlling YFP expression. (B) A comparison between model-calculated transcription rates and measured YFP expression levels when either (stars) 0, (circles) 1, (diamonds) 2, (squares) 4, or (triangles) 8 additional on-target DNA sites were added. The DNA sites’ initial superhelical densities were either (left) increased by 0.0065 per occupied site or (right) kept constant. Data points and bars represent the mean and standard deviation of 2 measurements, performed in this study.
Fig 4. Parameterized free energy models show how mismatched crRNA guide sequences and DNA site sequences affect Cas9 cleavage activity.
The (A) 21 position-dependent and (B) 256 sequence-dependent free energy model coefficients were determined using either (left) 3671 in vitro Cas9 cleavage measurements from dataset I or (right) 5979 in vivo Cas9 cleavage measurements from dataset II. Coefficients were normalized to their maximum values. White boxes show unidentifiable model parameters, based on the available measurements. (C) Comparisons between apparent and model-calculated ΔΔGexchange across all single measurements. Pearson R² is 0.74 and 0.61, respectively. All points represent single measurements from Pattanayak et al., Hsu et al., and Mali et al. [33,37,41]. (D) An example showing how the model is used to calculate ΔΔGexchange and ΔGPAM for a specific guide RNA sequence and DNA site. The energetic contributions of the three mismatches are determined by their (A) position-dependent coefficients and their (B) dinucleotide RNA:DNA identities, using the model parameterized by (left) dataset I. The (green box) PAM sequence determines ΔGPAM using Table 3.
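The calculation in panel D can be sketched roughly as follows. This is only an illustration under stated assumptions: the caption does not say how the position-dependent and dinucleotide coefficients are combined, so the sketch assumes each mismatched dinucleotide step contributes the product of its position weight and its dinucleotide energy. The function name, the indexing scheme, and all coefficient values are hypothetical and are not the fitted parameters.

```python
def ddg_exchange(guide, site, position_weights, pair_energies):
    """Sum mismatch penalties over dinucleotide steps (kcal/mol).

    Assumptions (not from the paper):
    - guide and site are written in the same orientation and alphabet,
      so a perfect match means the two strings are identical;
    - step i covers characters i and i+1 and has weight position_weights[i];
    - pair_energies maps (guide_dinucleotide, site_dinucleotide) to an
      energy, and unknown pairs contribute 0.0.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    total = 0.0
    for i in range(len(guide) - 1):
        guide_step, site_step = guide[i:i + 2], site[i:i + 2]
        if guide_step == site_step:
            continue  # matched steps carry no penalty in this sketch
        total += position_weights[i] * pair_energies.get((guide_step, site_step), 0.0)
    return total


# Hypothetical toy inputs: a 4-nt guide with one mismatch at the last base.
toy_weights = [1.0, 0.5, 0.25]
toy_energies = {("GT", "GA"): 2.0}
toy_penalty = ddg_exchange("ACGT", "ACGA", toy_weights, toy_energies)
```

With two-nucleotide keys on both strands there are 16 × 16 = 256 possible pairs, which matches the number of sequence-dependent coefficients reported in the caption.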
Fig 5. Calculation of dCas9:crRNAλ2 binding occupancy across 34,363 PAM sites on a λ-phage genome.
(A) Model-calculated target binding free energies (ΔGtarget) are shown across genome position, plotting only one in ten positions for improved visualization. Panels represent either the (top, blue) forward strand or (bottom, red) reverse strand of the λ-phage genome. The target binding free energies are the sum of (B) the free energy change when dCas9 binds to a PAM site (ΔGPAM), (C) the free energy change during R-loop formation at PAM-proximal sites, compared to a perfectly complementary sequence (ΔΔGexchange), and the free energy change as a result of varying DNA site superhelical density (ΔΔGsupercoiling). The major on-target site λ2 is denoted by stars. A major off-target site OS1 is denoted by crosses. Here, each mismatch in the crRNA and DNA site sequences contributes up to 0.78 kcal/mol to ΔΔGexchange, depending on their distance from the PAM site. The λ-phage genome is assumed to have uniform DNA superhelical density. The model-calculated binding probabilities of (d)Cas9:crRNAλ2 to all possible PAM sites are shown at (D) the initial time before any Cas9 activity or (F) after a 10 minute incubation with (d)Cas9:crRNAλ2. (E) We show the model-calculated dynamics of (d)Cas9 binding occupancy at the (black line) λ2 DNA site, the (green line) major off-target site OS1, and a (inset) single off-target site with ΔGtarget = 0 kcal/mol.
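One way to picture how site free energies translate into the initial binding probabilities of panel D is a Boltzmann weighting over all candidate sites. The sketch below assumes that the probability of occupying site j is proportional to exp(-ΔGtarget,j / RT), with more negative ΔGtarget meaning tighter binding. Both the sign convention and the temperature are assumptions made for illustration, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_probabilities(dg_targets, temperature_k=310.15):
    """Normalize Boltzmann weights exp(-dG/RT) over a list of sites.

    Assumption: sites compete for a single complex at equilibrium, and a
    more negative free energy means stronger binding.
    """
    rt = R_KCAL_PER_MOL_K * temperature_k
    exponents = [-dg / rt for dg in dg_targets]
    shift = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical free energies (kcal/mol): one strong site, one weaker site,
# and one site at 0 kcal/mol.
example_probs = binding_probabilities([-10.0, -8.0, 0.0])
```

Because the weights are exponential in ΔGtarget, a difference of a few kcal/mol between sites changes their relative occupancy by orders of magnitude, while a very large number of weak sites can still collectively sequester a substantial share of the complex.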
Fig 6. Model predictions for human genome editing.
(A) Model-calculated distributions show the numbers of human genome DNA sites that will be cleaved with varying efficiencies when using an LTR-B crRNA with either (yellow) baseline, (blue) 10-fold lower, or (green) 10-fold higher Cas9 and crRNA concentrations. (B) The expected number of off-target indel mutations when counting sites with cleavage efficiencies higher than a cut-off value. (C) The required next-generation sequencing coverage to identify the expected number of off-target indel mutations with 99% certainty. Colors are the same as in A. (D) The model-calculated dynamics of human genome modification under the same three scenarios, comparing (solid lines) on-target cleavage versus (dashed lines) the ratio between on-target and total off-target cleavage (specificity).
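The coverage requirement in panel C can be illustrated with a simple sampling argument. The sketch assumes that each read covering a site independently carries the indel with probability equal to the site's indel frequency, and that "identifying" a mutation means observing it in at least one read. The paper's exact criterion may differ, and the function name is hypothetical.

```python
import math


def required_coverage(indel_frequency, certainty=0.99):
    """Smallest read depth n such that 1 - (1 - f)**n >= certainty.

    Assumption: reads are independent draws, each showing the indel with
    probability f (the indel frequency at that site).
    """
    if not 0.0 < indel_frequency < 1.0:
        raise ValueError("indel_frequency must be strictly between 0 and 1")
    if not 0.0 < certainty < 1.0:
        raise ValueError("certainty must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - certainty) / math.log(1.0 - indel_frequency))


# A site mutated in 1% of cells needs several hundred reads under this model.
depth_for_one_percent = required_coverage(0.01)
```

Under this model the required depth scales roughly as 1/f for small f, so each 10-fold drop in the cleavage-efficiency cut-off calls for about 10-fold deeper sequencing.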
Fig 7. Rational design of genome editing and gene regulation.
(A) The dynamics of Cas9-based cleavage at DNA sites with either (blue) zero, (green) one, (red) two, or (cyan) three mismatches, comparing the effects of increasing guide RNA concentration by 10-fold, increasing the genome size by 2-fold, or increasing the cellular growth rate by 2-fold. (B) A sensitivity analysis shows how changing system parameters affects a DNA site’s steady-state cleavage efficiency in growing cells. (C) The dynamics of dCas9-based transcriptional repression (promoter activity) at DNA sites with either (blue) zero, (green) one, (red) two, or (cyan) three mismatches, performing the same comparisons as in A. (D) A sensitivity analysis shows how changing system parameters affects a DNA site’s steady-state transcriptional repression (promoter activity) in growing cells. mm, mismatch.

References

    1. Cong L, Ran FA, Cox D, Lin S, Barretto R, et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339: 819–823. doi: 10.1126/science.1231143
    2. Deltcheva E, Chylinski K, Sharma CM, Gonzales K, Chao Y, et al. (2011) CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471: 602–607. doi: 10.1038/nature09886
    3. Jiang W, Bikard D, Cox D, Zhang F, Marraffini LA (2013) RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nature Biotechnology 31: 233–239. doi: 10.1038/nbt.2508
    4. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, et al. (2012) A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science 337: 816–821. doi: 10.1126/science.1225829
    5. Qi LS, Larson MH, Gilbert LA, Doudna JA, Weissman JS, et al. (2013) Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152: 1173–1183. doi: 10.1016/j.cell.2013.02.022
