Review
Nat Rev Microbiol. 2020 Feb;18(2):67-83.
doi: 10.1038/s41579-019-0299-x. Epub 2019 Dec 19.

Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants

Kira S Makarova et al. Nat Rev Microbiol. 2020 Feb.

Abstract

The number and diversity of known CRISPR-Cas systems have substantially increased in recent years. Here, we provide an updated evolutionary classification of CRISPR-Cas systems and cas genes, with an emphasis on the major developments that have occurred since the publication of the latest classification, in 2015. The new classification includes 2 classes, 6 types and 33 subtypes, compared with 5 types and 16 subtypes in 2015. A key development is the ongoing discovery of multiple, novel class 2 CRISPR-Cas systems, which now include 3 types and 17 subtypes. A second major novelty is the discovery of numerous derived CRISPR-Cas variants, often associated with mobile genetic elements that lack the nucleases required for interference. Some of these variants are involved in RNA-guided transposition, whereas others are predicted to perform functions distinct from adaptive immunity that remain to be characterized experimentally. The third highlight is the discovery of numerous families of ancillary CRISPR-linked genes, often implicated in signal transduction. Together, these findings substantially clarify the functional diversity and evolutionary history of CRISPR-Cas.

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Fig. 1 | Updated classification of class 1 CRISPR–Cas systems.
The figure schematically shows representative (typical) CRISPR–cas loci of each class 1 subtype and of selected distinct variants, with the dendrogram on the left showing the likely evolutionary relationships between the types and subtypes. The column on the right indicates the organism and the corresponding gene range. Homologous genes are colour-coded and identified by a family name. The gene names follow the previous classification. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. The small subunit is encoded by csm2, cmr5, cse2, csa5 and several additional families of homologous genes that are collectively denoted cas11. The adaptation module genes cas1 and cas2 are dispensable in subtypes III-A and III-E (dashed lines). Gene regions coloured cream represent the HD nuclease domain; the HD domain in Cas10 is distinct from that in Cas3 and Cas3″. Functionally uncharacterized genes are shown in grey. The tan shading shows the effector module. The grey shading of different hues shows the two levels of classification: subtypes and variants. Most of the subtype III-B, III-C, III-E and III-F loci, as well as IV-B and IV-C loci, lack CRISPR arrays and are shown accordingly, although for each of the type III subtypes exceptions have been detected. CHAT, protease domain of the caspase family; RT, reverse transcriptase; TPR, tetratricopeptide repeat.
Fig. 2 | Updated classification of class 2 CRISPR–Cas systems.
The figure schematically shows representative (typical) CRISPR–cas loci for each class 2 subtype and for selected distinct variants, with the dendrogram on the left showing the likely evolutionary relationships between the types and subtypes. The column on the right indicates the organism and the corresponding gene range. Homologous genes are colour coded and are identified by a family name following the previous classification. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. The grey shading of different hues shows the two levels of classification: subtypes and variants. The adaptation module genes cas1 and cas2 are present in only a subset of the subtype V-D, VI-A and VI-D loci and are accordingly shown by dashed lines. The WYL-domain-encoding genes and csx27 genes are also dispensable and shown by dashed lines. Additional genes encoding components of the interference module, such as transactivating CRISPR RNA (tracrRNA), are shown. The domains of the effector proteins are colour-coded: RuvC-like nuclease, green; HNH nuclease, yellow; higher eukaryotes and prokaryotes nucleotide-binding (HEPN) RNase, purple; transmembrane domains, blue.
Fig. 3 | Distribution of the six types of CRISPR–Cas system in the major archaeal and bacterial phyla.
The heat map shows the weighted fraction (between 0 and 1.0) of the genomes in each of the major archaeal and bacterial phyla in which CRISPR–Cas systems of the respective type have been detected. Each CRISPR–cas locus of a given type within a taxon was assigned a weight equal to the weight of the respective genome (see the Supplementary Methods for details); additionally, the weights of the genomes that lack CRISPR–Cas loci were collected. The sum of the weights of the CRISPR–cas loci of each type was normalized by the sum total of the weights across the taxon. ‘Partial or unknown’ indicates CRISPR–cas loci that could not be assigned to any of the known types.
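
The normalization described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name, the tuple layout of the input and the example weights are all hypothetical, and the genome weights themselves are taken as given (their calculation is described in the Supplementary Methods). The sketch also assumes that a genome counts once per type even if it carries several loci of that type, which keeps each fraction between 0 and 1.0; the caption does not state this explicitly.

```python
from collections import defaultdict


def weighted_type_fractions(genomes):
    """Return {phylum: {crispr_type: weighted fraction of genomes}}.

    `genomes` is an iterable of (phylum, genome_weight, locus_types) tuples
    (a hypothetical layout), where `locus_types` lists the type of every
    CRISPR-cas locus in the genome and is empty for genomes without loci.
    """
    type_weight = defaultdict(lambda: defaultdict(float))
    total_weight = defaultdict(float)
    for phylum, weight, locus_types in genomes:
        # Every genome, with or without loci, contributes to the denominator.
        total_weight[phylum] += weight
        # Assumption: a genome counts once per type, so fractions stay <= 1.
        for crispr_type in set(locus_types):
            type_weight[phylum][crispr_type] += weight
    return {
        phylum: {t: w / total_weight[phylum] for t, w in by_type.items()}
        for phylum, by_type in type_weight.items()
    }


# Made-up weights for illustration only.
example = [
    ("Phylum A", 1.0, ["I"]),
    ("Phylum A", 0.5, ["I", "III"]),
    ("Phylum A", 0.5, []),
]
fractions = weighted_type_fractions(example)
print(fractions["Phylum A"])
```

In this toy input the total weight of the phylum is 2.0, so type I (carried by genomes with a combined weight of 1.5) gets a fraction of 0.75 and type III gets 0.25. Phyla in which no locus was detected do not appear in the result.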
Fig. 4 | Ancillary genes in CRISPR–Cas systems.
The basic molecular machinery of CRISPR–Cas systems consists of the cas core genes. The core genes are often accompanied by diverse ancillary genes that perform additional or regulatory functions. The ancillary genes are typically present only in subsets of the CRISPR–cas loci of the respective types and subtypes and often also occur in other, non-cas genomic contexts. Prediction of the ancillary genes was performed using the ‘CRISPRicity’ protocol, as we previously described. Operationally, the list of ancillary genes includes families labelled as ‘associated’ in the profFam.tab column in Supplementary Dataset 2. The numbers of occurrences (counts) of ancillary genes in each unambiguously classified CRISPR–cas locus were averaged across the system subtypes using genome weights, calculated as described in the Supplementary Methods. The occurrence of ancillary genes across the types and subtypes of CRISPR–Cas systems is shown (part a). The vertical axis shows the weighted mean numbers of ancillary genes per locus in different subtypes. The common ancillary genes and their distribution among CRISPR–Cas types and subtypes are also shown (part b). Gene families are denoted with the corresponding profile names (Supplementary Dataset 2). The weighted mean number of ancillary genes per locus in different subtypes is colour coded as per the scale shown at the bottom.
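
The weighted averaging mentioned in this caption can be illustrated with a short Python sketch. It assumes the standard weighted mean, sum(weight × count) / sum(weight), taken over the loci of each subtype; the exact procedure is the one given in the Supplementary Methods, and the function name, input layout and numbers below are hypothetical.

```python
from collections import defaultdict


def weighted_mean_ancillary_counts(loci):
    """Return {subtype: weighted mean number of ancillary genes per locus}.

    `loci` is an iterable of (subtype, genome_weight, ancillary_gene_count)
    tuples (a hypothetical layout), one per classified CRISPR-cas locus.
    """
    weighted_sum = defaultdict(float)
    weight_sum = defaultdict(float)
    for subtype, weight, count in loci:
        weighted_sum[subtype] += weight * count
        weight_sum[subtype] += weight
    # Skip subtypes whose weights sum to zero to avoid dividing by zero.
    return {
        subtype: weighted_sum[subtype] / weight_sum[subtype]
        for subtype in weighted_sum
        if weight_sum[subtype] > 0
    }


# Made-up loci for illustration only.
example_loci = [
    ("III-A", 1.0, 2),
    ("III-A", 0.5, 5),
    ("I-E", 1.0, 0),
]
means = weighted_mean_ancillary_counts(example_loci)
print(means)
```

For the toy subtype III-A loci the weighted mean is (1.0 × 2 + 0.5 × 5) / 1.5 = 3.0 ancillary genes per locus, whereas the unweighted mean would be 3.5; down-weighting loci from heavily sampled genomes is what the genome weights are for.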
Fig. 5 | Outline of a complete scenario for the origins and evolution of CRISPR–Cas systems.
The figure depicts a hypothetical scenario of the origin of CRISPR–Cas systems from an ancestral signalling system (possibly an abortive infection defence system (Abi)). This putative ancestral Abi module shares a cyclic oligoA polymerase Palm domain (RNA recognition motif (RRM) fold) with Cas10 and is proposed to function analogously to type III CRISPR–Cas systems. Specifically, cyclic oligoA molecules that are synthesized in response to virus infection bind to the CRISPR-associated Rossmann fold (CARF) domain of the second protein in this system, resulting in activation of the RNase activity of the higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domain, which induces dormancy through indiscriminate RNA cleavage. This putative ancestral Abi module would give rise to the type III-like CRISPR–Cas effector module via duplication of the RRM domain, with subsequent inactivation of one of the copies (the two RRM domains are denoted RRM1 and RRM2). The ancestral class 1 CRISPR–Cas system is inferred to have evolved through the merger of two modules: the adaptation module, including the CRISPR repeats, derived from a casposon, and the type III-like effector module, likely derived from the ancestral Abi system. The subsequent acquisition of the HD nuclease domain by the effector module provided for RNA-guided DNA cleavage. Inactivation of the oligoA polymerase domain in the effector complex, or possibly replacement of Cas10 by an unrelated protein and acquisition of the Cas3 helicase, led to the emergence of type I systems, which lack the cyclic oligoA-dependent signalling pathway and exclusively cleave double-stranded DNA. Class 2 systems of type II and different subtypes of type V appear to have evolved independently by the recruitment of distinct TnpB nucleases that are encoded by IS605-like transposable elements. Type VI likely originated from an RNA-cleaving, HEPN domain-containing abortive infection or toxin–antitoxin system. 
Some CRISPR–Cas systems, such as type IV and Tn7-linked systems I-F3 and V-K, were subsequently recruited by mobile genetic elements and lost their interference capacity along with the original defence function. The key evolutionary events are described to the right of the images. The typical CRISPR–cas operon organization is shown for each CRISPR–Cas subtype and for selected distinct variants. Homologous genes are colour-coded and identified by a family name following the previous classification. The multiforking arrows denote events that have been inferred to have occurred on multiple, independent occasions during the evolution of CRISPR–Cas systems. GGDD, key catalytic motif of the cyclase or polymerase domain of Cas10 that is involved in the synthesis of cyclic oligoA signalling molecules; HRAMP, haloarchaeal repeat-associated mysterious proteins; TR, terminal repeats; tracrRNA, transactivating CRISPR RNA; TSD, target site duplication, the likely source of ancestral repeats.
