Proc Natl Acad Sci U S A. 2023 Nov 28;120(48):e2308224120.
doi: 10.1073/pnas.2308224120. Epub 2023 Nov 20.

Diversity, evolution, and classification of the RNA-guided nucleases TnpB and Cas12

Han Altae-Tran et al. Proc Natl Acad Sci U S A.

Abstract

The TnpB proteins are transposon-associated RNA-guided nucleases that are among the most abundant proteins encoded in bacterial and archaeal genomes, but whose functions in the transposon life cycle remain unknown. TnpB appears to be the evolutionary ancestor of Cas12, the effector nuclease of type V CRISPR-Cas systems. We performed a comprehensive census of TnpBs in archaeal and bacterial genomes and constructed a phylogenetic tree on which we mapped various features of these proteins. In multiple branches of the tree, the catalytic site of the TnpB nuclease is rearranged, demonstrating structural and probably biochemical malleability of this enzyme. We identified numerous cases of apparent recruitment of TnpB for other functions of which the most common is the evolution of type V CRISPR-Cas effectors on about 50 independent occasions. In many other cases of more radical exaptation, the catalytic site of the TnpB nuclease is apparently inactivated, suggesting a regulatory function, whereas in others, the activity appears to be retained, indicating that the recruited TnpB functions as a nuclease, for example, as a toxin. These findings demonstrate remarkable evolutionary malleability of the TnpB scaffold and provide extensive opportunities for further exploration of RNA-guided biological systems as well as multiple applications.

Keywords: CRISPR-Cas12; OMEGA-TnpB; classification; diversity; evolution.

Conflict of interest statement

Competing interests statement: F.Z. is a scientific advisor and cofounder of Editas Medicine, Beam Therapeutics, Pairwise Plants, Arbor Biotechnologies, and Aera Therapeutics. F.Z. is also a scientific advisor for Octant.

Figures

Fig. 1.
Comprehensive phylogenomic analysis of TnpBs and Cas12s. (A) Overview of differences between OMEGA-TnpB and CRISPR-Cas12. (B) Analysis pipeline used to generate protein sequences for phylogenetic analysis. The first bold number (from left to right) shows the size of the nucleotide database, while the subsequent bold numbers show the number of proteins remaining after each filtering step. (C) Phylogenetic analysis of 6931 TnpB cluster representatives (at 50% sequence identity) using IQ-Tree2. Bootstrap values are shown as a gradient from red to yellow to black. Major clades are shown around the tree along with their designations. Colored lines around the tree indicate association rates with various elements: upstream CRISPR array (U CR), downstream CRISPR array (D CR), Cas1, Cas2, Cas4, Y1 TnpA (Y1), serine recombinase TnpA (SER). Next, mobility or nonmobility, as determined by genome copy counts from complete genomes (when available), is shown in black and brown lines. Then, noncanonical catalytic amino acids for RuvC-I, II, and III are shown as orange lines. Last, trimmed protein lengths are shown as gray bars. The outermost ring contains classifications of TnpBs and Cas12s. Systems with conserved substitutions in the RuvC active site are classified as rearranged and marked accordingly. (D) Alignment of TnpB major clade consensus sequences in the RuvC-I, II, III, and ZF regions.
Fig. 2.
Transposon associations of TnpB and features of the corresponding genomic loci and proteins. (A) Distribution of TnpBs and Cas12 across archaea, bacteria, viruses, and plasmids. (B) Occurrence of TnpB and Cas12 in various phyla sorted by abundance. Phyla with lower abundance of TnpBs and Cas12s are not shown. (C) Distribution of Cas12 subtypes across archaea, bacteria, viruses, and plasmids. (D) Fraction of redundant loci containing CRISPR arrays (CR), Cas1, Cas2, and Cas4 for various Cas12 subtypes. (E) Median repeat length distributions of various Cas12 subtypes. Box and whisker plots shown; median (white circle), 25th and 75th percentiles (thick vertical black line), interquartile range (thin vertical black line). (F) CRISPR array length distributions for various Cas12 subtypes. Box and whisker plots as in (E). (G) Distribution of CRISPR array spacer matches of various Cas12 subtypes against viruses and plasmids. X axis shown as percentage of matches relative to all unique spacers (at 50% sequence identity). (H) Zoom-in of phylogenetic tree from Fig. 1 focusing on Cas12b evolution from TnpB. (I) Inferred evolution of Cas12b from TnpB, as well as potential evolution of Cas12h from Cas12b. Putative Y1-7 branch ancestor shown on top. (J) AlphaFold2 structural prediction of evolutionary stages of Cas12b evolution superimposed upon the Cas12b guide and DNA. The upper left of each inset includes the effector protein along with RuvC active site positions (red) and insertions relative to recent common ancestor (orange).
Fig. 3.
Catalytic site rearrangements of TnpB. Structural comparisons of various catalytic site rearrangements in TnpBs and Cas12s using AlphaFold2 predictions. RuvC HJ resolvase is from PDB: 1HJR.
Fig. 4.
Transposons associated with TnpBs. Genomic loci of TnpBs along with their associated transposons.
Fig. 5.
Derived and exapted TnpB systems. (A) Various identified systems with TnpBs recruited for alternative biological functions. (B) Structural comparison of TnpB with the cofolded TnpB + Sigma Factor system. (C) Structural superimposition of TnpB + Sigma Factor with the Sigma28-RNAP complex (PDB: 6PMI). (D) Probable evolutionary scenario of conversion from Cas12f into an inactivated TnpB system associated with RpoE (Sigma), relaxases, MobC, phage integrases. The likely evolution of the associated RNAs is also shown. (E) Structural model of the TnpB from the SpoIIE-TnpB system. A previously unreported lid domain covers the RuvC active site of TnpB. (F) Rearranged catalytic site of the SpoIIE associated TnpB with potentially novel function. (G) Cofolding of SpoIIE with associated TnpB. (H) Structural model of the TnpB from the RHH-TnpB (MazE-TnpB) system. (I) Structural model of the RHH-TnpB complex from the RHH-TnpB system, along with superimposition of the RHH on a related DNA binding protein (PDB: 2MRU).
