Nat Microbiol. 2025 Dec;10(12):3346-3361.
doi: 10.1038/s41564-025-02180-8. Epub 2025 Nov 6.

An updated evolutionary classification of CRISPR-Cas systems including rare variants

Kira S Makarova et al. Nat Microbiol. 2025 Dec.

Abstract

The known diversity of CRISPR-Cas systems continues to expand. To encompass new discoveries, here we present an updated evolutionary classification of CRISPR-Cas systems. The updated CRISPR-Cas classification includes 2 classes, 7 types and 46 subtypes, compared with the 6 types and 33 subtypes in our previous survey 5 years ago. In addition, a classification of the cyclic oligoadenylate-dependent signalling pathway in type III systems is presented. We also discuss recently characterized alternative CRISPR-Cas functionalities, notably, type IV variants that cleave the target DNA and type V variants that inhibit the target replication without cleavage. Analysis of the abundance of CRISPR-Cas variants in genomes and metagenomes shows that the previously defined systems are relatively common, whereas the more recently characterized variants are comparatively rare. These low abundance variants comprise the long tail of the CRISPR-Cas distribution in prokaryotes and their viruses, and remain to be characterized experimentally.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Modular organization of CRISPR–Cas systems.
In class 1 CRISPR–Cas systems, effector modules consist of multiple Cas proteins that form a crRNA-binding complex and function together in target binding and cleavage. Class 2 systems have a single multidomain crRNA-binding protein that is functionally analogous to the entire effector complex of class 1. Subtype III-E is an exception within class 1, with a single effector protein composed of several domains derived from type III-D systems. The schematic shows the typical relationships between genetic, structural and functional organization for the seven types of CRISPR–Cas systems. Protein names follow the current nomenclature. Dispensable (and/or missing in some subtypes and variants) components are indicated by dashed outlines. Cas6 is shown with a thin solid outline for type I because it is dispensable in some but essential in most systems and with a dashed line for type III because most of these apparently use the Cas6 protein provided in trans by other CRISPR–cas loci. New type VII has a unique effector protein (Cas14) composed of two domains: β-CASP family RNase fused to a domain homologous to the C terminus of Cas10. The three colours for Cas9, Cas10, Cas12 and Cas13 each reflect the fact that these proteins contribute to different stages of the CRISPR–Cas activity. The CARF and HEPN domains, often fused in a single protein, are the most common sensors and effectors, respectively, in the type III ancillary modules, but several alternative sensors and effectors have been identified as well and are discussed in more detail in the main text. RING nucleases (Crn) cleave cyclic oligoA produced by Cas10 and thus control the indiscriminate RNase activity of the HEPN domain of a CARF protein. HRAMP and ARAMP are array-less CRISPR-like systems derived from type III, which are typically associated with different nucleases, most often of HNH and PD-(D/E)xK families. *Putative small subunit that is fused to the large subunit in several type I subtypes.
**This function can be performed by unrelated proteins; HD, HNH and PD-(D/E)xK, nucleases of the respective superfamilies; LS, large subunit; RT, reverse transcriptase.
Fig. 2. New class 1 CRISPR–Cas systems.
a, Genetic organization of the recently identified type VII and subtypes III-G, III-H and III-I CRISPR–Cas systems. Homologous domains are colour-coded and identified by a family name. The gene names follow the previous classification. Protein-coding genes are shown roughly to scale; CRISPR arrays are not shown to scale. In subtypes III-G, III-H and III-I, the Palm domains of Cas10 responsible for signalling-molecule synthesis are inactivated. The organism and the corresponding gene range are shown in grey on the right of the loci. If genes are not present in every locus of the respective systems, they are shown by dashed outlines. In the type III-G schematic, the question mark next to Csx26 indicates that the assignment of this protein as a Cas11 counterpart remains hypothetical. b, Schematic representation of CRISPR–Cas effector complexes of type VII and subtypes III-D, III-E, III-G, III-H and III-I. The type VII, subtype III-D1 and III-E schematics are based on the solved structures (Protein Data Bank (PDB) identifiers 8zwl, 8bww and 7y82, respectively). The schematics for subtypes III-G and III-H are based on AlphaFold3 complex models using the following proteins and RNAs: III-G (NC_012623.1, Sulfolobus islandicus Y.N.15.51): 6×Cas7 (most 5′), 1×Cas7 (between csx26 and cas10), 5×Csx26, 1×Cas5, 1×Cas10, crRNA (48 nucleotides (nt)); III-H (RPGO01000026 ‘Candidatus Argoarchaeum ethanivorans’, AEth_01085): 7×Cas7, 1×Cas5, 1×Cas10, crRNA is the same as for III-G; III-I (Desulfonema magnum WP_207677910 and WP_207677911.1): 1×Cas7-11i, 1×Cas10, crRNA is from type III-E of D. magnum (PDB 7zol). AlphaFold3 interface predicted template modelling scores: III-G, 0.52; III-H, 0.59; III-I, 0.75. The black line schematically represents crRNA. Full models in Extended Data Fig. 5. c, Genomic locus organization of type I and type IV variants that acquired a new nuclease domain and/or lost Cas3 helicase-nuclease. Designations are as in a. 
d, Genomic locus organization of new Tn7 and Mu transposon-associated type I CRISPR–Cas variants. The tns and tni genes encode transposase subunits. Designations are as in a. e, Genomic locus organization of new type III variants. Designations are as in a. PD-(D/E)xK, nuclease of the respective superfamilies; TPR, tetratricopeptide repeats; STAND, signal transduction ATPases with numerous domains; NTD, N-terminal domain; CTD, C-terminal domain.
Fig. 3. The built-in signalling pathway of type III CRISPR–Cas systems.
a, Phylogenetic tree of Cas10 with information on associated genes involved in the signalling pathway and domain functionality mapped to the branches. Branches are colour-coded according to the key on the right. The circles around the tree show (from the inner to the outer): (1) inactive PALM (iPALM) domain, indicating that the respective Cas10 cannot make a second messenger molecule; (2) the presence of the HD-nuclease domain; (3)–(6) the presence of genes encoding Crf1, Crf2, Crf4 and Crf9 in the respective loci; (7) the presence of a gene encoding the CorA homologue; and (8) the presence of genes encoding SAM–AMP cleaving enzymes. b, Schematic of the signalling pathway. HEPN, PIN and RelE, ribonucleases of the respective families; CorA, divalent cation channel; HTH, helix-turn-helix domain. *In some Crf4 proteins, PD-(D/E)xK nuclease is capable of cleaving both DNA and RNA.
Fig. 4. New class 2 CRISPR–Cas systems.
a, Schematic of the relationships between type II subtypes based on Cas9 phylogenetic analysis (Extended Data Fig. 6). New subtypes and variants are shown in blue. b, Loci organization of new type II subtypes and variants. c, Schematic of the relationships among new and previously experimentally characterized type V subtypes based on the UPGMA dendrogram (Extended Data Fig. 4b) and information on their function and organization. Y, presence of the feature; p, partial presence of the feature; n, absence of the feature. ‘Collateral’ refers to indiscriminate cleavage of single-stranded RNA or DNA by Cas nucleases. d, Dendrogram for new experimentally characterized subtype V-F variants (Extended Data Fig. 4b) and information on their functionality and organization. e, Loci organization of the new experimentally characterized type V subtypes and variants. f, Schematic of the relationships between type VI subtypes (Extended Data Fig. 3c). New subtypes and variants are shown in blue. g, Loci organizations of new type VI subtypes and variants. In b, e and g, the organism and the corresponding gene range are shown in grey on the right of the loci. Inact, inactivated.
Fig. 5. Distribution of CRISPR–Cas systems across bacteria and archaea.
a, Weighted fraction of completely sequenced archaeal and bacterial genomes encoding different types of CRISPR–Cas systems (for each taxon, the sum of genome weights across genomes, containing a system of the given type, was divided by the total sum of genome weights within the taxon). b, Heat map of the relative abundances of different subtypes of CRISPR–Cas systems in major clades of archaea and bacteria across completely sequenced genomes. The heat map shows the weighted fraction (scaled from 0 to 1.0) of complete CRISPR–Cas systems of a particular subtype in each of the major archaeal and bacterial phyla. Each occurrence of a CRISPR–Cas system of a given type within a taxon was assigned a weight equal to the weight of the respective genome (details in Methods and ref. ). The sum of the weights of the CRISPR–cas loci of each type was normalized to the sum of the weights across the taxon. c, Relative abundance of ‘old’ (identified in the 2020 classification and data file 7 in ref. ) and ‘new’ (this Analysis) subtypes of CRISPR–Cas systems in completely sequenced prokaryotic genomes and metagenomic datasets. d, Relative abundances of old and new (current work) subtypes of CRISPR–Cas systems in the clustered NR database.
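The weighted fraction described in panel a can be illustrated with a short calculation. The following is a minimal sketch only: the function name, the tuple layout and the toy genome weights are hypothetical and are not taken from the paper, whose actual weighting scheme is described in its Methods.

```python
from collections import defaultdict


def weighted_fraction(genomes, system_type):
    """Return, per taxon, the weighted fraction of genomes encoding system_type.

    genomes: iterable of (taxon, genome_weight, set_of_system_types).
    This tuple layout is an assumption made for illustration.
    """
    total = defaultdict(float)      # sum of all genome weights per taxon
    with_type = defaultdict(float)  # sum of weights of genomes carrying the type
    for taxon, weight, types in genomes:
        total[taxon] += weight
        if system_type in types:
            with_type[taxon] += weight
    # Divide the weight of genomes carrying the type by the taxon's total weight.
    return {t: with_type[t] / total[t] for t in total if total[t] > 0}


# Toy input: weights and type assignments are invented for the example.
genomes = [
    ("Taxon A", 1.0, {"I", "III"}),
    ("Taxon A", 0.5, {"I"}),
    ("Taxon A", 0.5, set()),
    ("Taxon B", 1.0, {"II"}),
]
fractions = weighted_fraction(genomes, "I")
print(fractions)
```

Weighting each genome, rather than counting genomes directly, keeps densely sequenced lineages from dominating the estimate for a taxon.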
Fig. 6. Main trajectories and processes in the evolution of CRISPR–Cas systems.
Schematic depiction of the inferred stages of the evolution of the CRISPR–Cas systems and the evolutionary processes involved. Likely ancestral systems (left), diversification of the adaptive immunity systems (middle) and evolution of additional functions of CRISPR–Cas systems including their recruitment by MGEs (right) are illustrated. The arrows of different colours show distinct routes of evolution. IS, insertion sequences.
Extended Data Fig. 1. Updated classification of type I CRISPR–Cas systems.
The figure schematically shows representative (typical) CRISPR–cas loci for each type I subtype and selected distinct variants, with the dendrogram on the left showing the likely evolutionary relationships between the subtypes. The column on the right indicates the organism and the corresponding gene range. Homologous genes are colour-coded and identified by a family name. The gene names follow the previous classification. The pink shading shows the effector module. The grey shading of different hues shows the two levels of classification, subtypes and variants. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. DNA and RNA are the molecules targeted by the CRISPR–Cas systems.
Extended Data Fig. 2. Updated classification of CRISPR types III, IV and VII.
Designations are the same as in Extended Data Fig. 1. Additional subunits of effector complexes are shown as grey arrows. Most of the subtype III-B, III-C, III-E and III-F loci, as well as the IV-B and IV-C loci, lack CRISPR arrays and are shown accordingly, although exceptions have been detected for each of these type III subtypes. The dashed line leading to subtype III-E indicates its likely origin from III-D2. Abbreviations: CHAT, protease domain of the caspase family; TPR, tetratricopeptide repeats; RT, reverse transcriptase. DNA and RNA are the molecules targeted by the systems.
Extended Data Fig. 3. Updated classification of CRISPR types II and VI.
a,b, The figure schematically shows representative (typical) CRISPR–cas loci of each type II (a) and type VI subtype (b) and selected distinct variants, with the dendrogram on the left showing the likely evolutionary relationships between the types and subtypes. The column on the right indicates the organism and the corresponding gene range. Homologous genes are colour-coded and identified by a family name following the previous classification. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. The grey shading of different hues shows the two levels of classification, subtypes and variants. The adaptation module genes cas1 and cas2 are present in only a subset of the subtype VI-A and VI-D loci and are accordingly shown by dashed lines. The WYL-domain-encoding genes and csx27 genes are also dispensable and thus shown by dashed lines. Additional genes encoding components of the interference module, such as tracrRNA, are shown. The domains of the effector proteins are colour-coded: RuvC-like nuclease, green; HNH nuclease, yellow; HEPN RNase, purple; transmembrane domains, blue. DNA and RNA are the molecules targeted by the systems. c, Deep relationships among type VI effector families. Profile–profile comparisons were performed and the UPGMA dendrogram was constructed as described in Methods. The tree is based on the most conserved C-terminal HEPN domain alignments only. HHsearch was run with the minimum length coverage for hits set to l = 0.33, -u = 2.3 -gcut = 0.667. Multiple alignments (profiles) of the C-terminal HEPN domains are available in data file 5 in ref. , and the original tree is available in data file 4 in ref. . The dashed line corresponds to the tree depth D between 1.5 and 2 (D = 2 roughly corresponds to the pairwise HHsearch similarity score of exp(−2D) ≈ 0.02 relative to the self-score) and separates most of the subtypes assigned previously or in this work.
New subtypes are highlighted in blue. d, Organization of representative loci for distinct variants of subtype VI-B. Designations are the same as in b.
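The relationship between tree depth and relative similarity score quoted for panel c can be checked numerically. This is a minimal sketch, assuming the relative score decays as exp(−2D), which is the reading under which D = 2 gives a value close to 0.02; the function names are hypothetical.

```python
import math


def relative_score(depth):
    """Relative HHsearch similarity (score / self-score), assuming exp(-2 * depth)."""
    return math.exp(-2.0 * depth)


def depth_for_score(score):
    """Inverse of relative_score: tree depth at which the relative score equals score."""
    return -0.5 * math.log(score)


# The dashed line in the caption lies between depths 1.5 and 2.
for d in (1.5, 2.0):
    print(f"D = {d}: relative score = {relative_score(d):.4f}")
```

Under this assumption the cut-off range D = 1.5 to 2 corresponds to relative scores between roughly 0.05 and 0.02.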
Extended Data Fig. 4. Updated classification of type V CRISPR–Cas systems.
a, Schematics of organization of type V CRISPR–Cas systems. The figure schematically shows representative (typical) CRISPR–cas loci of each type V subtype that has been experimentally characterized and/or described in the previous classification, and of distinct variants, with the dendrogram on the left showing the likely evolutionary relationships between the types and subtypes. The column on the right indicates the organism and the corresponding gene range. Homologous genes are colour-coded and identified by a family name following the previous classification. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. The grey shading of different hues shows the two levels of classification, subtypes and variants. The adaptation module genes cas1 and cas2 are present in only a subset of the type V subtypes. Dispensable (and/or missing in some subtypes and variants) components are indicated by dashed outlines. Additional genes encoding components of the interference module, such as tracrRNA, are shown. The domains of the effector proteins are colour-coded: RuvC-like nuclease (RuvC motifs I, II, III), green. b, Deep relationships between type V effector families. Profile–profile comparisons (RuvC-domain only) were performed and the UPGMA dendrogram was constructed as described in Methods. Specifically, HHsearch was run with the minimum length coverage for hits set to 0.033, -u = 2.3 -gcut = 0.667. Multiple alignments (profiles) used for this analysis are available in data file 5 in ref. and the original tree is available in data file 4 in ref. . The dashed line corresponds to the tree depth D between 1.5 and 2 (D = 2 roughly corresponds to the pairwise HHsearch similarity score of exp(−2D) ≈ 0.02 relative to the self-score) and separates most of the subtypes that were assigned previously or in this work. New subtypes are highlighted in blue.
IS605 (magenta) stands for the TnpB RNA-guided nuclease encoded by IS605 family transposons and not associated with CRISPR arrays.
Extended Data Fig. 5
AlphaFold3 models for III-G, III-H and III-I CRISPR effector complexes compared with solved structures of III-D1, III-E and VII effector complexes. The models are the same as schematically shown in Fig. 2b. Distinct subunits are coloured as in Fig. 2b.
Extended Data Fig. 6. CRISPR–Cas subtype III-I.
a, Comparison of the organization of representative typical III-D2, III-E and III-I loci. Cas7 and Cas11 domains within large multidomain proteins are shown by boxes within the respective arrows. b, Dendrograms for individual Cas7 domains. The larger tree is a hybrid UPGMA/FastTree tree built for all best hits obtained by PSI-BLAST search with the three individual Cas7 domains of Cas7-11i used as queries. The smaller UPGMA dendrogram was built using a matrix of pairwise RMSD scores obtained by DALI comparison of individual Cas7 domains from the Cas7-11i AF3 model, Cas7-11e structure (7zol), AF3 model for GwCas7*3 28 and the Cas7 domain from the III-D effector complex structure (8bmw). Underneath the trees, a scheme of similarity between the Cas7 domains of Cas7-11i and Cas7-11e is shown. Solid lines indicate Cas7 domains with structural similarity and dashed lines indicate domains with significant sequence similarity. c, AF3 model for Cas7-11i (WP_207677910.1) and Cas10i (WP_207677911.1) complexed with crRNA from subtype III-D. Cas7 and putative Cas11 domains are coloured; Cas10i is shown in blue. d, Structural alignment of the AF2 model of Cas10i and the Cas10d structure (8s9t-C). Below the DALI structure-guided alignment, the alignment of the catalytic motif (GGDD) of Cas10 and the corresponding region of Cas10i is shown, demonstrating the disruption of the catalytic site in the latter.
Extended Data Fig. 7. Inactivated type V-B variant Cas12b3.
a, Genetic organization of Actinomyces sp. conjugative plasmid region encoding type IV-B and the B3 variant of subtype V-B. Plasmid-related genes are shown in brown. Other genes (black) are mostly uncharacterized or unrelated to CRISPR or known plasmid genes. b, Schematic representation of HHpred search results and substitution of key amino acids of the RuvC-I and RuvC-II sites in Cas12b3. c, Multiple alignment of mini CRISPR array associated with the V-B3 variant. CRISPR repeats are shown in red. Genome names and accession numbers are indicated above the alignment.
Extended Data Fig. 8. Hypothetical scenario for the origins and evolution of CRISPR–Cas systems.
The figure is an amended version of Fig. 6 from the 2020 classification of CRISPR–Cas systems. The key evolutionary events are described to the right of the images. The multiforking arrows denote events that have been inferred to have occurred on multiple, independent occasions during the evolution of CRISPR–Cas systems. Additional abbreviations: “GGDD”, key catalytic motif of the cyclase/polymerase domain of Cas10 that is involved in the synthesis of cOA; TR, terminal repeats.
Extended Data Fig. 9. Example of Cas9 shuffling in situ and estimate of the shuffling frequency.
a, Phylogenetic analysis of subtype II-C genes from Flavobacteriales. Cas1 phylogenetic tree is shown on the left and Cas9 tree is shown on the right. Both trees were constructed using FastTree as described in Methods. Species of different genera are shown in different colours. Arrows indicate several outstanding examples of cas9 exchanged in situ (when closely related cas1 genes are associated with distantly related cas9 genes). The loci schematics and percent identity for selected genes are shown. b, Estimated shuffling rate of adaptation versus effector genes.
Extended Data Fig. 10. Deep relationships among structures of large subunits of type I effector complexes.
a, Relationships between the structures of type I large subunit representatives (Cas8 and Cas10d families). Cas8 and Cas10d correspond to distinct profiles/subfamilies (data files 5 and 6 in ref. ), and representatives of each subfamily were modelled with AF2 (ref. ; with max_template_date = 2024-12-12). In addition, resolved structures of Cas8 and Cas10d were retrieved from PDB. Additional domains present in some of the Cas8 and Cas10d proteins (such as Cas5, Cas3 and Cas11) were identified and trimmed off the structure to keep the respective core structures. These core structures were compared all against all using DALI. Pairs without detectable similarity (no DALI z-score reported) were set artificially to a z-score of 0.1. The pairwise DALI z-scores were normalized by the minimum of the self-scores and converted to a distance matrix on the natural log scale. The UPGMA dendrogram was reconstructed from this distance matrix. A depth of ~1 (red dashed line) corresponds to a pairwise z-score of ~7.5–9. Profile IDs are indicated after the vertical bar. b, Profile–profile comparisons for large subunits of type I effector complexes (Cas8 and Cas10d families) were performed and the UPGMA dendrogram was constructed as described in Methods. HHsearch was run with the minimum length coverage for hits set to l = 0.33, -u = 2.3 -gcut = 0.667. Multiple alignments (profiles) used for this analysis are available in data files 5 and 6 in ref. . The dashed line corresponds to the tree depth D = 2, roughly corresponding to the pairwise HHsearch similarity score of exp(−2D) ≈ 0.02 relative to the self-score. Typically, this tree depth reflects reliable sequence similarity.
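The conversion of DALI z-scores into a distance matrix described in panel a can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the caption says only that z-scores were normalized by the minimum of the self-scores and converted on the natural log scale, so the exact formula used here, d = −ln(z_ij / min(z_ii, z_jj)), is an assumption, as are the function name, the symmetric input and the toy z-scores.

```python
import math


def zscores_to_distances(z, floor=0.1):
    """Convert a symmetric matrix of DALI z-scores into log-scale distances.

    z[i][i] holds self-scores; z[i][j] is None where no similarity was
    reported and is then replaced by `floor` (0.1, as stated in the caption).
    The formula -ln(z_ij / min(z_ii, z_jj)) is an assumed reading.
    """
    n = len(z)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            raw = z[i][j] if z[i][j] is not None else floor
            normalized = raw / min(z[i][i], z[j][j])
            # Clamp at zero in case a pairwise score exceeds the smaller self-score.
            dist[i][j] = max(0.0, -math.log(normalized))
    return dist


# Toy z-scores, invented for the example.
z = [
    [60.0, 8.1, None],
    [8.1, 64.0, 12.0],
    [None, 12.0, 50.0],
]
dist = zscores_to_distances(z)
for row in dist:
    print([round(x, 3) for x in row])
```

A matrix of this form can then be passed to any average-linkage (UPGMA) clustering routine to obtain a dendrogram.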

References

    1. Komor, A. C., Badran, A. H. & Liu, D. R. CRISPR-based technologies for the manipulation of eukaryotic genomes. Cell 168, 20–36 (2017).
    2. Shivram, H., Cress, B. F., Knott, G. J. & Doudna, J. A. Controlling and enhancing CRISPR systems. Nat. Chem. Biol. 17, 10–19 (2021).
    3. Mohanraju, P. et al. Diverse evolutionary roots and mechanistic variations of the CRISPR–Cas systems. Science 353, aad5147 (2016).
    4. Hille, F. et al. The biology of CRISPR–Cas: backward and forward. Cell 172, 1239–1259 (2018).
    5. McGinn, J. & Marraffini, L. A. Molecular mechanisms of CRISPR–Cas spacer acquisition. Nat. Rev. Microbiol. 17, 7–12 (2019).
