Review
Philos Trans R Soc Lond B Biol Sci. 2019 May 13;374(1772):20180087.
doi: 10.1098/rstb.2018.0087.

Origins and evolution of CRISPR-Cas systems

Eugene V Koonin et al. Philos Trans R Soc Lond B Biol Sci. 2019.

Abstract

CRISPR-Cas, the bacterial and archaeal adaptive immunity systems, encompass a complex machinery that integrates fragments of foreign nucleic acids, mostly from mobile genetic elements (MGE), into CRISPR arrays embedded in microbial genomes. Transcripts of the inserted segments (spacers) are employed by CRISPR-Cas systems as guide (g)RNAs for recognition and inactivation of the cognate targets. The CRISPR-Cas systems consist of distinct adaptation and effector modules whose evolutionary trajectories appear to be at least partially independent. Comparative genome analysis reveals the origin of the adaptation module from casposons, a distinct type of transposons, which employ a homologue of Cas1 protein, the integrase responsible for the spacer incorporation into CRISPR arrays, as the transposase. The origin of the effector module(s) is far less clear. The CRISPR-Cas systems are partitioned into two classes, class 1 with multisubunit effectors, and class 2 in which the effector consists of a single, large protein. The class 2 effectors originate from nucleases encoded by different MGE, whereas the origin of the class 1 effector complexes remains murky. However, the recent discovery of a signalling pathway built into the type III systems of class 1 might offer a clue, suggesting that type III effector modules could have evolved from a signal transduction system involved in stress-induced programmed cell death. The subsequent evolution of the class 1 effector complexes through serial gene duplication and displacement, primarily of genes for proteins containing RNA recognition motif domains, can be hypothetically reconstructed. In addition to the multiple contributions of MGE to the evolution of CRISPR-Cas, the reverse flow of information is notable, namely, recruitment of minimalist variants of CRISPR-Cas systems by MGE for functions that remain to be elucidated. Here, we attempt a synthesis of the diverse threads that shed light on CRISPR-Cas origins and evolution. 
This article is part of a discussion meeting issue 'The ecology and evolution of prokaryotic CRISPR-Cas adaptive immune systems'.

Keywords: adaptive immunity; gene shuffling; mobile genetic elements; signalling.

Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
Class 1 and class 2 CRISPR-Cas systems: key features, modular organization and module shuffling. (a) The general architectures of class 1 (multiprotein effector complexes) and class 2 (single-protein effector complexes) CRISPR-Cas systems. Genes are shown as arrows; homologous genes are shown by the same colour. Gene names follow the current nomenclature and classification [8,13]. (b) The principal building blocks of CRISPR-Cas system types. An asterisk indicates the putative small subunit (SS) that might be fused to the large subunit in several type I subtypes [13]. The # next to the CARF and HEPN domain labels indicates that other unknown sensor and effector domains can be involved in the signalling pathway. Dispensable genes are indicated by a dashed outline. The bottom panel schematically shows module shuffling in CRISPR-cas loci. (Online version in colour.)
Figure 2.
Key features and general organization of class 1 CRISPR-Cas systems. The figure schematically shows the general organization of a class 1 effector complex. The colouring of the shapes corresponds to the colour code for cas genes in figure 1. All proteins of class 1 effector complex that contain an RRM domain are schematically shown in a separate panel. The topology diagram of the RRM fold is schematically shown, with numbers corresponding to the typical order of β-strands in the fold. The Cas6 structure with two RRM fold domains, which are numbered according to the topology diagram, is shown as the ribbon diagram. The structural comparison of cryo-electron microscopy models demonstrates a striking similarity of effector complex organization of types I and III, despite the absence of significant sequence similarity between the corresponding subunits. The Electron Microscopy Data Bank (EMDB) codes are indicated for each structure. crRNA, CRISPR RNA; RAMP, Repeat-Associated Mysterious Protein. (Online version in colour.)
Figure 3.
Origin and evolution of class 1 CRISPR-Cas systems. The figure depicts a hypothetical scenario of the origin of class 1 CRISPR-Cas from an ancestral signalling system and its subsequent evolution yielding the extant type III and type I systems, as well as reductive evolution that produced type IV systems and minimalist variants of type I systems recruited by Tn7 transposons. The key evolutionary events are described to the right of the images. ‘GGDD’, a key catalytic motif of the cyclase/polymerase domain of Cas10; ‘A’, catalytically active RRM domain of a RAMP protein; RE, LE, right and left end, respectively; TR, terminal repeats. (Online version in colour.)
Figure 4.
Key features, general organization and domain architectures of class 2 systems. Schematic of the complexes of effector proteins, with the target DNA or RNA, guide RNA and (for type II) tracrRNA shown on the top of the figure, and the domain architectures of the effector proteins depicted underneath. The catalytic residues of the effector nuclease domain and, for Cas12a and Cas13a, the residues shown to be required for pre-crRNA processing are indicated in red. The Protein Data Bank (PDB) codes are included for proteins with solved structures. HTH, helix–turn–helix DNA-binding domain. The tracrRNA, the pre-crRNA processing catalytic sites and the nicking, target strand-cleaving nuclease of the type V effectors are denoted by asterisks to indicate that they are each present only in subsets of the type II and type V effectors. The catalytic amino acid residues of the target-cleaving and pre-crRNA processing nucleases are shown in red. The small blue boxes show the approximate location of pre-crRNA processing nuclease domain. I, II, III are the distinct amino acid motifs that jointly compose the catalytic site of the RuvC-like nuclease. In the motif signature, ‘x’ stands for any amino acid, and ‘..’ indicates that the catalytic residues are separated by a small, variable number of non-conserved residues. Adapted from [8], with permission. (Online version in colour.)
Figure 5.
Origin of the class 2 CRISPR-Cas effectors from MGE. The figure depicts a hypothetical scenario of the origin of class 2 CRISPR-Cas from non-autonomous transposons, for type II and type V systems, and from a defence system (toxin–antitoxin module) for type VI systems. IscB and TnpB are the inferred ancestors of the type II (Cas9) and type V (Cas12) effectors, respectively. Inserts that could have contributed to increased specificity and efficiency of the effectors are shown by grey rectangles of variable size. I, II and III are the distinct amino acid motifs that jointly compose the catalytic site of the RuvC-like nuclease. TR, terminal repeats. (Online version in colour.)
Figure 6.
‘IN and OUT’: exchange of components between CRISPR-Cas systems and MGE. The figure shows a hypothetical scenario of cas gene acquisition by evolving CRISPR-Cas systems from MGE (IN) and by MGE from CRISPR-Cas systems (OUT). Genes are shown by arrows. The colouring corresponds to distinct cas genes and is the same as in figure 1. Grey arrows denote any genes that are considered to be unrelated to CRISPR-Cas. Specific acquisition events are shown for class 1 and class 2 systems separately. Arrows indicate the inferred direction of the gene flow. HEPN, RNase of the HEPN (Higher Eukaryotes and Prokaryotes Nucleotide-binding domain) superfamily; RE, LE, right and left end, respectively; RT, reverse transcriptase; TR, terminal repeats. (Online version in colour.)

References

    1. Sorek R, Lawrence CM, Wiedenheft B. 2013. CRISPR-mediated adaptive immune systems in bacteria and archaea. Annu. Rev. Biochem. 82, 237–266. (10.1146/annurev-biochem-072911-172315)
    2. Wright AV, Nunez JK, Doudna JA. 2016. Biology and applications of CRISPR systems: harnessing nature's toolbox for genome engineering. Cell 164, 29–44. (10.1016/j.cell.2015.12.035)
    3. Komor AC, Badran AH, Liu DR. 2017. CRISPR-based technologies for the manipulation of eukaryotic genomes. Cell 169, 559. (10.1016/j.cell.2017.04.005)
    4. Mohanraju P, Makarova KS, Zetsche B, Zhang F, Koonin EV, van der Oost J. 2016. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science 353, aad5147. (10.1126/science.aad5147)
    5. Hsu PD, Lander ES, Zhang F. 2014. Development and applications of CRISPR-Cas9 for genome engineering. Cell 157, 1262–1278. (10.1016/j.cell.2014.05.010)
