Nature. 2023 Apr;616(7956):390-397.
doi: 10.1038/s41586-023-05933-9. Epub 2023 Apr 5.

Cryo-EM structure of the transposon-associated TnpB enzyme


Ryoya Nakagawa et al. Nature. 2023 Apr.

Abstract

The class 2 type V CRISPR effector Cas12 is thought to have evolved from the IS200/IS605 superfamily of transposon-associated TnpB proteins [1]. Recent studies have identified TnpB proteins as miniature RNA-guided DNA endonucleases [2,3]. TnpB associates with a single, long RNA (ωRNA) and cleaves double-stranded DNA targets complementary to the ωRNA guide. However, the RNA-guided DNA cleavage mechanism of TnpB and its evolutionary relationship with Cas12 enzymes remain unknown. Here we report the cryo-electron microscopy (cryo-EM) structure of Deinococcus radiodurans ISDra2 TnpB in complex with its cognate ωRNA and target DNA. In the structure, the ωRNA adopts an unexpected architecture and forms a pseudoknot, which is conserved among all guide RNAs of Cas12 enzymes. Furthermore, the structure, along with our functional analysis, reveals how the compact TnpB recognizes the ωRNA and cleaves target DNA complementary to the guide. A structural comparison of TnpB with Cas12 enzymes suggests that CRISPR-Cas12 effectors acquired an ability to recognize the protospacer-adjacent motif-distal end of the guide RNA-target DNA heteroduplex, by either asymmetric dimer formation or diverse REC2 insertions, enabling engagement in CRISPR-Cas adaptive immunity. Collectively, our findings provide mechanistic insights into TnpB function and advance our understanding of the evolution from transposon-encoded TnpB proteins to CRISPR-Cas12 effectors.

Conflict of interest statement

F.Z. is a co-founder of Editas Medicine, Beam Therapeutics, Pairwise Plants, Arbor Biotechnologies and Sherlock Biosciences. J.v.d.O. is a co-founder of NTrans Technologies and a scientific advisor for NTrans Technologies, Scope Biosciences and Hudson River Biotechnology. O.N. is a co-founder, board member and scientific advisor for Curreio. The remaining authors declare no competing interests.

Figures

Fig. 1. Cryo-EM structure of the TnpB–ωRNA–target DNA ternary complex.
a, Schematic of the D. radiodurans ISDra2 locus. The MGE consists of the tnpA and tnpB genes flanked by the left end (LE) and right end (RE) elements of the transposon. The ωRNA is derived from the 3′ end of the tnpB gene and the RE element. b, Diagram of the ωRNA and target DNA used for cryo-EM analysis. The target strand (TS) and non-target strand (NTS) each comprise 35 nucleotides, and the non-target strand contains a TTGAT TAM sequence. The 247-nt ωRNA was co-expressed and co-purified with TnpB. Nucleotides −231 to −117, −70 to −49, −20 to −17 and 13 to 16 of the ωRNA, nucleotides −8 to 4 and 27 of the target strand, and nucleotides −11* and 1* to 24* of the non-target strand were not included in the final model. c, The domain structure of TnpB. CTD, C-terminal domain. Residues 281 to 296 and 379 to 408 were not included in the final model. d, Cryo-EM density map of the TnpB–ωRNA–target DNA complex. e, The overall structure of the TnpB–ωRNA–target DNA complex. Disordered regions are indicated as dotted lines.
Fig. 2. ωRNA architecture.
a, Schematic of the ωRNA and target DNA. Nucleotides −231 to −117, −70 to −49, −20 to −17, and 13 to 16 of the ωRNA, nucleotides −8 to 4 and 27 of the target strand, and nucleotides −11* and 1* to 24* of the non-target strand were disordered and not included in the model. These disordered regions are enclosed in grey boxes. b, Structure of the ωRNA scaffold. The disordered regions are indicated as dotted lines. c, Insertion–deletion mutation (indel) formation efficiencies of TnpB with wild-type (WT) ωRNA, ωRNA with deleted 5′ region (Trim1), ωRNA lacking both 5′ region and stem 3b (Trim2) and non-targeting ωRNA (NT) at seven endogenous target sites in HEK293FT cells. Data are mean ± s.d. (n = 3 biologically independent samples).
Fig. 3. ωRNA recognition.
a, Recognition of the ωRNA scaffold by the TnpB protein. TnpB is shown as a surface model. The ωRNA scaffold is recognized mainly through the WED and RuvC domains. b–e, Recognition of the stem 1 and triple helix structure (b), stems 2 and 3a (c), stem 4 (d) and the PK architecture (e) of the ωRNA scaffold. Hydrogen bonding and electrostatic interactions are shown as dashed lines.
Fig. 4. Target DNA recognition and loading.
a, Recognition of the target DNA. TnpB is shown as an electrostatic surface potential model. The TAM duplex is bound to the cleft formed by the WED and REC domains. The guide RNA–target DNA heteroduplex is accommodated within a positively charged central channel formed by the REC and RuvC domains. b,c, Recognition of the TAM duplex. Nucleotides −1*dT, 18dT and −4*dT and residues Y52, S56, F77 and T123 are depicted by space-filling models. Hydrogen bonding and electrostatic interactions are shown as dashed lines. d, In vitro DNA cleavage activities of the wild-type TnpB and TAM recognition mutants with ωRNA with deleted 5′ region. The 3-kb linearized target DNA containing the 16-nt target sequence was incubated with the TnpB–ωRNA complex (250 nM) at 37 °C for 30 min. The reaction products were resolved, visualized and quantified with a MultiNA microchip electrophoresis device. Data are mean ± s.d. (n = 3 biologically independent samples). The experiments were repeated three times with similar results. e,f, Recognition of the TAM-proximal region of the guide RNA–target DNA heteroduplex (e) and the TAM-distal region of the heteroduplex (f). Hydrogen bonding and electrostatic interactions are shown as dashed lines. g, Positions of the active site and the target DNA. The possible trajectories of the target strand are shown by a dashed arrow. h, In vitro DNA cleavage activities of TnpB with five target DNAs with different sequences at the re-hybridized DNA duplex. Re-hybridized regions with altered sequences are highlighted with a yellow background. CG and AT sequences are coloured blue and green, respectively. Data are mean ± s.d. (n = 3 biologically independent samples). The experiments were repeated three times with similar results.
Fig. 5. Comparison of TnpB with Cas12f and Cas12m.
Structural comparison of TnpB with Cas12f (Cas12f from an uncultured archaeon) (PDB ID: 7C7L) and Cas12m (Cas12m from Mycobacterium mucogenicum) (PDB ID: 8HHL). Cas12f and Cas12m share high sequence similarity with TnpB proteins. Cas12f mol.2- and Cas12m-specific insertions (REC2) are highlighted in blue. These regions have a crucial role for the recognition of the PAM-distal region of the guide RNA–target DNA heteroduplex, suggesting that type V Cas12 enzymes acquired an ability to recognize the PAM-distal end of the guide RNA–target DNA heteroduplex in order to engage in CRISPR–Cas adaptive immunity. ZF, zinc-finger domain.
Extended Data Fig. 1. Multiple sequence alignment of TnpB orthologs.
Dra, TnpB from Deinococcus radiodurans (WP_010887311.1); Hsp, TnpB from Hydrococcus sp. RU_2_2 (NJM87737.1); Gdu, TnpB from Gloeocapsopsis dulcis (WP_105220324.1); Nap, TnpB from Nodularia sphaerocarpa (WP_239728827.1). The secondary structure of TnpB is indicated above the sequences. Key residues of TnpB are marked below the sequences by triangles. The figure was prepared using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo) and ESPript3 (http://espript.ibcp.fr/ESPript/ESPript).
Extended Data Fig. 2. Single-particle cryo-electron microscopy analysis.
(a) Size-exclusion chromatography profile of the TnpB–ωRNA–target DNA complex. The peak fraction (indicated by a black bar) was analyzed by SDS-PAGE and urea-PAGE, and then used for cryo-EM analysis. (b) A representative cryo-EM image of the TnpB–ωRNA–target DNA complex, recorded on a 300 kV Titan Krios with a K3 camera. (c) Single-particle cryo-EM image processing workflow. (d) Fourier shell correlation (FSC) curve for the 3D reconstruction. The gold-standard cutoff (FSC = 0.143) is marked with the black dotted line. (e) Direct distribution plot (Viewing distribution plot). (f) Direction 3DFSC plots calculated by 3DFSC processing Server (https://3dfsc.salk.edu/upload/info/). (g) Local-resolution cryo-EM density map.
Extended Data Fig. 3. Cryo-EM density map.
Cryo-EM density maps for the guide RNA–target DNA heteroduplex (a), the TAM duplex (b), the pseudoknot structure (c), stems 2 and 3 (d), and stem 1 and the triple helix structure (e). The ambiguous density in (d) corresponds to stem 3b.
Extended Data Fig. 4. Domain structures.
(a) The structures of TnpB, Cas12f (Cas12f from an uncultured archaeon) (PDB ID: 7C7L), Cas12a (Cas12a from Francisella novicida) (PDB ID: 6I1K), and Cas12e (Cas12e from a Deltaproteobacterium, also known as CasX) (PDB ID: 6NY2) were aligned, based on the guide RNA–target DNA heteroduplex. TnpB and these type V Cas12 enzymes commonly adopt a bilobed architecture containing the REC and NUC lobes, which is structurally similar to that of the WED and RuvC domains, despite their limited sequence identity (the conserved α helices (red) and β strands (blue) are numbered). The REC lobes commonly consist of the WED and REC domains, and the WED domain comprises an OB fold (the conserved α helices (red) and β strands (blue) are numbered). The first α helices in their REC domains are located at similar positions (as indicated by red arrows). Cas12f has the zinc finger (ZF) domain inserted between the WED and REC domains. Cas12a has the PAM-interacting (PI) domain inserted into the WED domain. Cas12e has the non-target-strand binding (NTSB) domain inserted into the REC1 domain. The NUC lobes of TnpB consist of the RuvC, TNB, and CTD domains, although the CTD domain is disordered. The RuvC domains comprise an RNase H fold (the conserved α helices (red) and β strands (blue) are numbered). The TNB domains are inserted between the conserved strand β5 and helix α4 in the RuvC domains. The TNB domains share low sequence similarity and adopt distinct structures. (b) The TNB domains of TnpB and Cas12f. Both TnpB and Cas12f contain a typical CXXC---CXXC zinc finger motif, each of which binds zinc ions.
Extended Data Fig. 5. Biochemical characterization of ωRNA.
(a) Urea-PAGE analysis of the purified TnpB–ωRNA complex. Although we co-expressed TnpB with a 247-nt ωRNA, the purified TnpB was bound to heterogeneous 100–160-nt parts of the ωRNA. (b) The sequence of full-length ωRNA (−231G to 16C). The probe positions used in the northern blotting analysis are shown in orange. Two GAAC sites that could generate a pGAACp fragment by RNase A digestion are indicated in red. (c) Northern blotting of ωRNA. Total RNAs prepared from E. coli wild-type cells (lane 1), ωRNA expressing cells (lane 2), ωRNA and MBP co-expressing cells (lane 3), ωRNA and TnpB co-expressing cells (lane 4), the in vitro transcript of ωRNA (lane 5), and the ωRNA extracted from TnpB (lane 6) were resolved by 10% denaturing PAGE and stained with GelGreen (left panel) or subjected to northern blotting (right panels, Probes I–V). 6S RNA (180-nt), 5S rRNA (120-nt), tRNAs (76 to 93-nt) in total RNA and 50-nt, 100-nt, and 300-nt RNA markers are indicated (lane M). (d) Collision-induced dissociation spectrum of the pGAACp fragment from ωRNA digested by RNase A. The divalent negatively charged ion of pGAACp was used as the precursor ion for CID. The product ions in the CID spectrum are assigned on the sequence.
Extended Data Fig. 6. Overlapping region between tnpB gene and ωRNA.
(a) Schematic illustrating the overlapping region between the tnpB gene (residues 335 to 408) and ωRNA (−231G to −10U). The disordered regions of TnpB and ωRNA are shown as dotted arrows. Except for a few nucleotides (indicated by the red box), the functionally important regions of the tnpB gene and ωRNA do not overlap. (b) In vitro DNA cleavage assay of the wild-type (WT) TnpB, the ωRNA mutant lacking the 5′ region (Δ−231G to −117T; Δ5′ region), and the C-terminal domain-deleted (Δ376 to 408; ΔCTD) mutant. The linearized plasmid target, containing a 16-nt target sequence and a TTGAT TAM sequence, was incubated with the TnpB–ωRNA complex at 37 °C for 30 min. The cleavage products were then analyzed by a MultiNA microchip electrophoresis system. (c) Quantification of the DNA cleavage data in (b). Data are mean ± s.d. (n = 3, biologically independent samples). The experiments were repeated three times with similar results. Source data are provided as a Source data file. (d) Thermal shift assay of the WT TnpB and the ΔCTD mutant, calculated by a NanoTemper Tycho NT.6 Differential Scanning Fluorimeter, which determines the inflection temperature (Ti) of samples.
Extended Data Fig. 7. ωRNA architecture and recognition.
(a) Structural comparison of the ωRNA scaffold of TnpB with the guide RNA scaffolds of Cas12f (PDB ID: 7C7L) and Cas12a (PDB ID: 6I1K). Cas12f associates with its cognate RNA scaffold formed by crRNA and tracrRNA, whereas Cas12a uses only crRNA. Although these RNAs lack sequence similarity, they contain conserved PK structures. PK, pseudoknot; R:AR, repeat-antirepeat. (b) Recognition of the PK architectures by TnpB, Cas12f, and Cas12a. The PK architectures are recognized by their cognate proteins in similar manners. (c) Schematic of the ωRNA. The crRNA-like region, tracrRNA-like region, and natural tetraloop are colored red, orange, and grey, respectively. The disordered regions are enclosed in dashed boxes.
Extended Data Fig. 8. Schematic of nucleic-acid recognition and target DNA cleavage.
(a) The residues that interact with the nucleic acids through their main chains are shown in parentheses. The disordered regions are indicated by dashed gray lines. (b) Effects of mismatches between the ωRNA guide sequence and the target DNA on TnpB-mediated DNA cleavage. The 3-kb linearized target DNA, containing a 16-nt target sequence or 2-nt mismatches at positions 1–16, was incubated with the TnpB–ωRNA complex (250 nM) at 50 °C for 30 min. The reaction products were resolved, visualized, and quantified with a MultiNA microchip electrophoresis device (SHIMADZU). Data are mean ± s.d. (n = 3, biologically independent samples). The experiments were repeated three times with similar results. Source data are provided as a Source data file. (c) Indel formation induced by TnpB at the on-target (AGBL1 gene) and off-target sites in HEK293FT cells. ON, on-target site; OFF, off-target site. (d) Positions of the active sites and the target DNAs of Cas12a (Cas12a from Francisella novicida) (PDB ID: 6I1K) and Cas12e (Cas12e from Deltaproteobacteria, also known as CasX) (PDB ID: 6NY2). The re-hybridized DNA duplexes are recognized by the TNB domain, thereby facilitating the DNA unwinding and loading into the RuvC active site.
Extended Data Fig. 9. Comparison of TnpB with diverse type V CRISPR-Cas12 enzymes.
Structural comparison of TnpB with Cas12f (Cas12f from an uncultured archaeon) (PDB ID: 7C7L), Cas12a (Cas12a from Francisella novicida) (PDB ID: 6I1K), and Cas12e (Cas12e from Deltaproteobacteria, also known as CasX) (PDB ID: 6NY2).
Extended Data Fig. 10. Comparison of TnpB with IscB and IsrB, and UPGMA dendrogram of Cas12 enzymes.
(a) Structural comparison of TnpB with IscB (IscB from the human gut metagenome) (PDB ID: 7UTN) and IsrB (IsrB from Desulfovirgula thermocuniculi) (PDB ID: 8DMB). (b) UPGMA dendrogram showing similarities between different families of Type V effectors. The dendrogram was built using the UPGMA (unweighted pair group method with arithmetic mean) method and is based on the matrix of HHalign scores calculated for all against all pairwise alignments, with length coverage of at least 33%. The alignments for the respective families were taken from a previous report, except for the Cas12m family for which an updated alignment (104 proteins) was used. The striped rectangle corresponds to the tree depth D between 1.5 and 2 (D = 2 roughly corresponds to the pairwise HHsearch similarity score of exp(−2D) ≈ 0.02 relative to the self-score), and reflects the tree depth where the subtype assignment is uncertain and a subject for additional consideration.
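The depth-to-score relation in panel (b) can be checked numerically. The short sketch below assumes the relative similarity score is exp(−2D), the reading consistent with the stated value of about 0.02 at D = 2; the function name relative_score is hypothetical and is not taken from the paper or from any HH-suite tool.

```python
import math


def relative_score(depth: float) -> float:
    """Similarity score relative to the self-score at tree depth D.

    Assumption: score = exp(-2 * D), the form that matches the caption's
    stated value of roughly 0.02 at D = 2.
    """
    return math.exp(-2.0 * depth)


# Bounds of the striped rectangle in the dendrogram (D between 1.5 and 2).
for depth in (1.5, 2.0):
    print(f"D = {depth}: relative score = {relative_score(depth):.3f}")
```

Under this assumption, the uncertain-subtype band between D = 1.5 and D = 2 spans relative scores from about 0.05 down to about 0.02.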

References

    1. Makarova KS, et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 2020;18:67–83. doi: 10.1038/s41579-019-0299-x.
    2. Altae-Tran H, et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science. 2021;374:57–65. doi: 10.1126/science.abj6856.
    3. Karvelis T, et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature. 2021;599:692–696. doi: 10.1038/s41586-021-04058-1.
    4. Hille F, et al. The biology of CRISPR–Cas: backward and forward. Cell. 2018;172:1239–1259. doi: 10.1016/j.cell.2017.11.032.
    5. Gasiunas G, Barrangou R, Horvath P, Siksnys V. Cas9–crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl Acad. Sci. USA. 2012;109:2579–2586. doi: 10.1073/pnas.1208507109.
