Elife. 2022 Aug 12;11:e77825.
doi: 10.7554/eLife.77825.

Closely related type II-C Cas9 orthologs recognize diverse PAMs

Jingjing Wei et al. Elife.

Abstract

The RNA-guided CRISPR/Cas9 system is a powerful tool for genome editing, but its targeting scope is limited by the protospacer-adjacent motif (PAM). To expand the target scope, it is crucial to develop a CRISPR toolbox capable of recognizing multiple PAMs. Here, using a GFP-activation assay, we tested the activities of 29 type II-C orthologs closely related to Nme1Cas9, 25 of which are active in human cells. These orthologs recognize diverse PAMs with variable length and nucleotide preference, including purine-rich, pyrimidine-rich, and mixed purine and pyrimidine PAMs. We characterized in depth the activity and specificity of Nsp2Cas9. We also generated a chimeric Cas9 nuclease that recognizes a simple N4C PAM, representing the most relaxed PAM preference for compact Cas9s to date. These Cas9 nucleases significantly enhance our ability to perform allele-specific genome editing.

Keywords: CRISPR/Cas9; Nme1Cas9 orthologs; Nsp2Cas9; PAM diversity; genetics; genome editing; genomics.

Conflict of interest statement

JW, LH, JL, ZW, TQ, SG, SS: No competing interests declared. SG, YW: Authors on patent application number 202110878452X, "Cas9 protein, gene editing system containing Cas9 protein and application"; this patent relates to the technical field of gene editing, and specifically to a CRISPR/Cas9 gene editing system and its application.

Figures

Figure 1. Screening of Nme1Cas9 ortholog activities through a GFP-activation assay.
(A) Schematic of the GFP-activation assay. A protospacer flanked by an 8 bp random sequence is inserted between the ATG start codon and the GFP-coding sequence, resulting in a frameshift mutation. The library DNA is stably integrated into HEK293T cells via lentivirus infection. Genome editing can produce in-frame mutations. The protospacer sequence is shown below. (B) The procedure of the GFP-activation assay. Cas9 and sgRNA expression plasmids were co-transfected into the reporter cells. GFP-positive cells can be observed if the protospacer is edited. (C) Twenty-five of the 29 Nme1Cas9 orthologs induced GFP expression. The percentage of GFP-positive cells is shown. Reporter cells without Cas9 transfection were used as a negative control. Scale bar: 250 μm.
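The reporter logic in panel A can be illustrated with a short sketch. It is not taken from the paper: the cassette length below is a hypothetical value (the caption does not state how many bases sit between the ATG and the GFP-coding sequence), and the function name is illustrative. The sketch only checks which net indel sizes would bring the downstream GFP sequence back into frame.

```python
def restores_frame(insert_len: int, indel: int) -> bool:
    """Return True if a net indel (negative = deletion, positive = insertion)
    leaves the cassette between ATG and GFP a multiple of 3 bases long."""
    new_len = insert_len + indel
    return new_len >= 0 and new_len % 3 == 0


# Hypothetical out-of-frame cassette length (assumption, not from the paper).
INSERT_LEN = 31

# Net indel sizes from -10 to +3 that would put GFP back in frame.
in_frame_indels = [
    d for d in range(-10, 4) if d != 0 and restores_frame(INSERT_LEN, d)
]
print(in_frame_indels)
```

Under this assumption only every third indel size restores the frame, which is consistent with the caption's statement that editing can, but need not, activate GFP.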
Figure 1—figure supplement 1. Phylogenetic tree of the selected Nme1Cas9 orthologs.
The amino acid sequences of the 29 selected Nme1Cas9 orthologs were aligned by Vector NTI. Nme1Cas9, Nme2Cas9, and Nme3Cas9 were used as references and are shown in green.
Figure 1—figure supplement 2. Alignment of the PI domain of Nme1Cas9 orthologs.
The Nme1Cas9 orthologs contained aspartate (A), histidine (B), or asparagine (C) residues corresponding to the Nme1Cas9 H1024. The PI domains were aligned by Vector NTI. Amino acids crucial for PAM recognition are shown above. Nme1Cas9, Nme2Cas9, and Nme3Cas9 were used as references and are shown in green.
Figure 1—figure supplement 3. Alignment of direct repeats and tracrRNAs of Nme1Cas9 orthologs.
(A) Alignment of direct repeat sequences for Nme1Cas9 orthologs. (B) Alignment of tracrRNAs for Nme1Cas9 orthologs. Sequence alignment revealed that direct repeats and the 5’ end of tracrRNAs were conserved among Nme1Cas9 orthologs. Strictly identical residues are highlighted with a red background, and conserved mutations are highlighted with an outline and red font.
Figure 1—figure supplement 4. Single-guide RNA (sgRNA) scaffolds of Nme1Cas9 orthologs.
In silico co-folding of the crRNA direct repeat and putative tracrRNA shows stable secondary structure and complementarity between the two RNAs.
Figure 1—figure supplement 5. Protein expression levels of Nme2Cas9 orthologs were analyzed by western blot.
HEK293T cells without Cas9 transfection were used as a negative control (NC).
Figure 2. PAM analysis for each Cas9 nuclease.
(A) Example of indel sequences measured by deep sequencing for Nsp2Cas9. The GFP coding sequences are shown in green; the 8 bp random sequence is shown in orange; black dashes indicate deleted bases; red bases indicate insertion mutations. (B) The PAM WebLogos for Nme1Cas9 orthologs containing an aspartate residue corresponding to the Nme1Cas9 H1024. PAM positions for each WebLogo are shown below. The PAM WebLogos for Nme2Cas9, Nsp2Cas9, PutCas9, SmuCas9, NarCas9, and PstCas9 were generated from the first round of PAM screening, and the PAM WebLogos for the others were generated from the second round of PAM screening. PAM positions in the screening assay are shown on the bottom right. (C) The PAM WebLogos for Nme1Cas9 orthologs containing histidine or asparagine residues corresponding to the Nme1Cas9 H1024. PAM positions for each WebLogo are shown below. The PAM WebLogo for Nan2Cas9 was generated from the first round of PAM screening, and the PAM WebLogos for the others were generated from the second round of PAM screening.
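As a rough illustration of what a PAM sequence logo summarizes, the sketch below computes per-position base frequencies and information content (in bits) from a set of equal-length PAM sequences. It is a minimal stand-in, not the authors' pipeline and not the WebLogo tool itself: the function names and the toy sequences are assumptions, and no small-sample correction is applied.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequencies(pams):
    """Per-position base frequencies for equal-length PAM sequences."""
    length = len(pams[0])
    if any(len(p) != length for p in pams):
        raise ValueError("all PAM sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(p[i] for p in pams)
        total = sum(counts[b] for b in BASES)
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def information_content(freq):
    """Bits of information at one position: 2 minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freq.values() if p > 0)
    return 2.0 - entropy


# Toy 8 bp PAM sequences (invented for illustration): position 5 is always C.
toy_pams = ["AAAACAAA", "CCCCCCCC", "GGGGCGGG", "TTTTCTTT"]
bits = [information_content(f) for f in position_frequencies(toy_pams)]
print([round(b, 2) for b in bits])
```

A position where all four bases are equally frequent carries 0 bits, while a fully conserved position carries 2 bits, which is why a conserved base stands tall in a logo.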
Figure 2—figure supplement 1. PAM wheels for Nme1Cas9 orthologs.
(A) PAM wheels for Nme1Cas9 orthologs containing an aspartate residue corresponding to the Nme1Cas9 H1024. PAM positions in the screening assay are shown on the bottom right. (B) PAM wheels for Nme1Cas9 orthologs containing histidine or asparagine residues corresponding to the Nme1Cas9 H1024. PAM wheels start in the middle of the wheel with the first 5’ base exhibiting sequence information.
Figure 2—figure supplement 2. The specificity between amino acids and bases in calculated structural models.
(A) Calculated structural model of BdeCas9. The amino acid near position 5 of the NTS is histidine. The histidine side chain forms a potential hydrogen bond with the 6-hydroxyl group of guanine. If the guanine is replaced by another base, this hydrogen bond would not form, either because the distance is too short (cytosine or thymine) or because the hydroxyl group is absent (adenine). (B) Calculated structural model of AsuCas9. The amino acid near position 5 of the NTS is aspartate, which forms a potential hydrogen bond with the 4-amine group of the cytosine. If the cytosine is replaced by another base, this hydrogen bond would be abolished, either because of the increased distance (adenine or guanine) or because the amine group is absent (thymine). (C) Calculated structural model of BdeCas9. The amino acid near position 8 of the TS is glutamine, which forms a potential hydrogen bond with the 6-amine group of the adenine on the TS. If the adenine is replaced by another base, this hydrogen bond would not form, either because the distance is too short (cytosine or thymine) or because the amine group is absent (guanine). TS: target strand; NTS: non-target strand.
Figure 3. Nsp2Cas9 enables editing in HEK293T cells.
(A) Schematic of Cas9 and sgRNA expression constructs. U1A: U1A promoter; pA: polyA; NLS: nuclear localization signal; HA: HA tag. (B) Protein expression levels of Nsp2Cas9 and Nme2Cas9 were analyzed by western blot. HEK293T cells without Cas9 transfection were used as a negative control (NC). (C) Comparison of Nsp2Cas9 and Nme2Cas9 editing efficiencies at 19 endogenous loci in HEK293T cells. Data represent mean ± SD for n=3 biologically independent experiments. (D) Quantification of the indel efficiencies for Nsp2Cas9 and Nme2Cas9. Each dot represents the average efficiency for an individual locus. Data represent mean ± SD for n=3 biologically independent experiments. P values were determined using a two-sided Student’s t test. P=0.7486 (P>0.05); ns stands for not significant.
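The statistics named in this caption (mean ± SD over three replicates, and a Student's t test across loci) can be sketched as follows. The caption does not say whether the test was paired or unpaired, so the unpaired, equal-variance form used here is an assumption, and the numbers are invented for illustration rather than taken from the paper.

```python
import math
from statistics import mean, stdev


def mean_sd(values):
    """Mean and sample standard deviation of replicate measurements."""
    return mean(values), stdev(values)


def student_t(a, b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom.
    (Assumed form; the caption does not specify paired vs. unpaired.)"""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Invented indel percentages for three replicates at one locus.
m, sd = mean_sd([10.0, 12.0, 14.0])

# Invented per-locus averages for two nucleases.
t_stat, dof = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(m, sd, round(t_stat, 4), dof)
```

Turning the t statistic into a two-sided p value requires the t distribution; a library routine such as `scipy.stats.ttest_ind` returns both in one call.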
Figure 3—figure supplement 1. The effect of spacer length on the efficiency of Nsp2Cas9 editing.
A single G5 site on the GRIN2B gene was targeted by sgRNAs with spacer lengths varying from 18 to 26 nt.
Figure 3—figure supplement 2. Nsp2Cas9 enables editing in different mammalian cells.
Nsp2Cas9 enables editing in HeLa (A), HCT116 (B), A375 (C), SH-SY5Y (D) and mouse N2a cells (E). Data represent mean ± SD (n=3).
Figure 3—figure supplement 3. Rational engineering of Nsp2Cas9.
(A) Schematic of the GFP-activation reporter construct for testing engineered Nsp2Cas9 activity. The protospacer sequence is shown below. (B) GFP-positive cells induced by the engineered Nsp2Cas9 variants. Data represent mean ± SD (n=3).
Figure 3—figure supplement 4. NarCas9 enables genome editing in mammalian cells.
(A) Schematic of Cas9 and sgRNA expression constructs. U1A: U1A promoter; pA: polyA; NLS: nuclear localization signal; HA: HA tag. (B) NarCas9 enables genome editing in HEK293T cells. Data represent mean ± SD (n=3). (C) NarCas9 enables genome editing in HeLa cells. Data represent mean ± SD (n=3).
Figure 4. Characterization of Nsp2-SmuCas9 for genome editing.
(A) Schematic diagram of chimeric Cas9 nucleases based on Nsp2Cas9. The PI domain of Nsp2Cas9 was replaced with the PI domain of SmuCas9. (B) Sequence logos and (C) PAM wheel diagrams indicate that Nsp2-SmuCas9 recognizes an N4C PAM. (D) Nsp2-SmuCas9 generated indels at endogenous sites with N4C PAMs in HEK293T cells. Indel efficiencies were determined by targeted deep sequencing. NarCas9 was used as a control. Data represent mean ± SD for n=3 biologically independent experiments. (E) Quantification of the indel efficiencies for Nsp2-SmuCas9 and NarCas9. Each dot represents the average efficiency for an individual locus. Data represent mean ± SD for n=3 biologically independent experiments. p values were determined using a two-sided Student’s t test. *p=0.0148 (0.01 < p < 0.05).
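To make the N4C notation concrete (four bases of any identity followed by a C), the sketch below scans a DNA sequence for candidate protospacers followed by a matching PAM. It is illustrative only: the function names are made up, the 22 nt spacer length is an assumed default, and only the given strand is scanned, not its reverse complement.

```python
import re

# IUPAC codes used by the PAMs mentioned in these captions.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def expand_pam(pam):
    """Expand shorthand such as 'N4C' into the explicit form 'NNNNC'."""
    return re.sub(r"([A-Z])(\d+)", lambda m: m.group(1) * int(m.group(2)), pam)


def find_sites(seq, pam="N4C", spacer_len=22):
    """Return (start, protospacer, PAM) for every site on the given strand.
    spacer_len=22 is an assumed default, not a value from the caption."""
    pam_regex = "".join(IUPAC[base] for base in expand_pam(pam))
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# Toy sequence: 22 A's, then GGGGC (an N4C match), then a trailing T.
toy_seq = "A" * 22 + "GGGGC" + "T"
sites = find_sites(toy_seq)
print(sites)
```

Because only the fifth PAM position is constrained, roughly one in four positions on a random sequence qualifies, which is what makes such a PAM "relaxed".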
Figure 4—figure supplement 1. Activity of four chimeric Cas9 nucleases tested through a GFP-activation assay.
(A) Nsp2-SmuCas9 and Nme2-SmuCas9 could induce GFP expression. Reporter cells without Cas9 transfection are used as a negative control. Scale bar: 250 μm. (B) Sequence logo and (C) PAM wheel diagram indicate that Nme2-SmuCas9 recognizes an N4C PAM. (D) Nme2-SmuCas9 generated indels at endogenous sites with N4C PAMs in HEK293T cells. Indel efficiencies were determined by targeted deep sequencing. Data represent mean ± SD for n=3 biologically independent experiments.
Figure 4—figure supplement 2. Structure of the fully complementary Cas9-sgRNA-dsDNA complex in a catalytic state.
(A) Alignment of the protein tertiary structures of the Nsp2-NarCas9 and Nme1Cas9-sgRNA-dsDNA complexes. Blue represents the Nme1Cas9 protein, and green represents the PI domain of NarCas9. PID: PAM-interacting domain. (B) Alignment of Nme1Cas9 and Nsp2-SmuCas9. Blue represents the Nme1Cas9 protein, orange represents the Nsp2Cas9 protein, and red represents the PI domain of SmuCas9. (C) Alignment of Nme1Cas9 and NarCas9. Green represents the NarCas9 protein. TS: target strand; NTS: non-target strand.
Figure 4—figure supplement 3. Calculated structural models of Nme2-NarCas9 and Nme2-SmuCas9 chimeras.
(A) Calculated structural models of Nme2-NarCas9 and Nme2-SmuCas9 chimeras. In the inactive Nme2-NarCas9 chimera (magenta), R1052 will clash with the DNA strand, leading to a failure to bind DNA. In the active Nme2-SmuCas9 chimera (salmon), R1052 will also clash with the DNA strand. (B) Overall calculated structures of Nme2-NarCas9 (magenta) and Nme2-SmuCas9 (salmon) chimeras with sgRNA and dsDNA. TS: target strand; NTS: non-target strand.
Figure 5. Comparison of indel efficiency between Nsp2Cas9, Nsp2-SmuCas9, and SpCas9.
(A) Schematic of Cas9 and sgRNA expression constructs. U1A: U1A promoter; pA: polyA; NLS: nuclear localization signal; HA: HA tag. (B) Protein expression levels of Nsp2Cas9, Nsp2-SmuCas9, and SpCas9 were analyzed by western blot. HEK293T cells without Cas9 transfection were used as a negative control (NC). (C) The editing efficiencies of Nsp2Cas9, Nsp2-SmuCas9, and SpCas9 varied depending on the target sites. The PAM sequences (NGGNCC) are shown below. Data represent mean ± SD for n=3 biologically independent experiments. (D) Quantification of the indel efficiencies for Nsp2Cas9, Nsp2-SmuCas9, and SpCas9. Each dot represents the average efficiency for an individual locus. Data represent mean ± SD for n=3 biologically independent experiments. p values were determined using a two-sided Student’s t test. p=0.3883, p=0.7316, p=0.0741 (p>0.05); ns stands for not significant. *p=0.0247, *p=0.0144 (0.01 < p < 0.05); * stands for significant. **p=0.0058 (p<0.01); ** stands for significant.
Figure 6. Analysis of Nsp2Cas9 specificity.
(A) Analysis of Nsp2Cas9 and Nme2Cas9 specificity with a GFP-activation assay. A panel of sgRNAs with dinucleotide mutations (red) is shown below. The editing efficiencies, reflected by the ratio of GFP-positive cells, are shown. Data represent mean ± SD for n=3 biologically independent experiments. (B) GUIDE-seq was performed to analyze the genome-wide off-target effects of Nsp2Cas9 and Nme2Cas9. On-target (indicated by stars) and off-target sequences are shown on the left. Read numbers are shown on the right. Mismatches compared to the on-target site are shown and highlighted in color.
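The two comparisons described in this caption can be sketched in a few lines: building a panel of guides that each carry one dinucleotide mutation, and listing where an off-target sequence differs from the on-target site. The substitution rule used here (each base replaced by its complement) is an assumption, since the caption does not say which bases were substituted, and the function names and toy spacer are hypothetical.

```python
# Assumed substitution rule: replace each base with its complement.
SWAP = {"A": "T", "T": "A", "C": "G", "G": "C"}


def dinucleotide_panel(spacer, step=2):
    """Return (1-based start position, mutated spacer) for adjacent
    two-base mutations tiled along the spacer."""
    panel = []
    for i in range(0, len(spacer) - 1, step):
        mutated = (spacer[:i] + SWAP[spacer[i]] + SWAP[spacer[i + 1]]
                   + spacer[i + 2:])
        panel.append((i + 1, mutated))
    return panel


def mismatch_positions(on_target, other):
    """1-based positions where two equal-length sequences differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(on_target, other)) if a != b]


# Toy 6 nt spacer, far shorter than a real guide, for readability.
panel = dinucleotide_panel("ACGTAC")
print(panel)
print(mismatch_positions("ACGTAC", "ACCAAC"))
```

Plotting editing efficiency against the start position of each mutated pair gives the kind of mismatch-tolerance profile shown in panel A.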
Figure 7. Analysis of Nsp2-SmuCas9 specificity.
(A) Analysis of Nsp2-SmuCas9 specificity with a GFP-activation assay. A panel of sgRNAs with dinucleotide mutations (red) is shown below. The editing efficiencies, reflected by the ratio of GFP-positive cells, are shown. Data represent mean ± SD for n=3 biologically independent experiments. (B) GUIDE-seq was performed to analyze the genome-wide off-target effects of Nsp2-SmuCas9. On-target (indicated by stars) and off-target sequences are shown on the left. Read numbers are shown on the right. Mismatches compared to the on-target site are shown and highlighted in color.
