Review

Philos Trans R Soc Lond B Biol Sci. 2019 May 13;374(1772):20180087.

doi: 10.1098/rstb.2018.0087.

Origins and evolution of CRISPR-Cas systems

Eugene V Koonin¹, Kira S Makarova¹

PMID: 30905284
PMCID: PMC6452270
DOI: 10.1098/rstb.2018.0087

Eugene V Koonin et al. Philos Trans R Soc Lond B Biol Sci. 2019.

Affiliation

¹ National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA.

Abstract

CRISPR-Cas, the bacterial and archaeal adaptive immunity systems, encompass a complex machinery that integrates fragments of foreign nucleic acids, mostly from mobile genetic elements (MGE), into CRISPR arrays embedded in microbial genomes. Transcripts of the inserted segments (spacers) are employed by CRISPR-Cas systems as guide (g)RNAs for recognition and inactivation of the cognate targets. The CRISPR-Cas systems consist of distinct adaptation and effector modules whose evolutionary trajectories appear to be at least partially independent. Comparative genome analysis reveals the origin of the adaptation module from casposons, a distinct type of transposons, which employ a homologue of Cas1 protein, the integrase responsible for the spacer incorporation into CRISPR arrays, as the transposase. The origin of the effector module(s) is far less clear. The CRISPR-Cas systems are partitioned into two classes, class 1 with multisubunit effectors, and class 2 in which the effector consists of a single, large protein. The class 2 effectors originate from nucleases encoded by different MGE, whereas the origin of the class 1 effector complexes remains murky. However, the recent discovery of a signalling pathway built into the type III systems of class 1 might offer a clue, suggesting that type III effector modules could have evolved from a signal transduction system involved in stress-induced programmed cell death. The subsequent evolution of the class 1 effector complexes through serial gene duplication and displacement, primarily of genes for proteins containing RNA recognition motif domains, can be hypothetically reconstructed. In addition to the multiple contributions of MGE to the evolution of CRISPR-Cas, the reverse flow of information is notable, namely, recruitment of minimalist variants of CRISPR-Cas systems by MGE for functions that remain to be elucidated. Here, we attempt a synthesis of the diverse threads that shed light on CRISPR-Cas origins and evolution. 
This article is part of a discussion meeting issue 'The ecology and evolution of prokaryotic CRISPR-Cas adaptive immune systems'.

Keywords: adaptive immunity; gene shuffling; mobile genetic elements; signalling.

Conflict of interest statement

We declare we have no competing interests.

Figures

**Figure 1.**
Class 1 and class 2 CRISPR-Cas systems: key features, modular organization and module shuffling. (a) The general architectures of class 1 (multiprotein effector complexes) and class 2 (single-protein effector complexes) CRISPR-Cas systems. Genes are shown as arrows; homologous genes are shown by the same colour. Gene names follow the current nomenclature and classification [8,13]. (b) The principal building blocks of CRISPR-Cas system types. An asterisk indicates the putative small subunit (SS) that might be fused to the large subunit in several type I subtypes [13]. The # next to the CARF and HEPN domain labels indicates that other unknown sensor and effector domains can be involved in the signalling pathway. Dispensable genes are indicated by a dashed outline. The bottom panel schematically shows module shuffling in CRISPR-*cas* loci. (Online version in colour.)

**Figure 2.**
Key features and general organization of class 1 CRISPR-Cas systems. The figure schematically shows the general organization of a class 1 effector complex. The colouring of the shapes corresponds to the colour code for *cas* genes in figure 1. All proteins of class 1 effector complex that contain an RRM domain are schematically shown in a separate panel. The topology diagram of the RRM fold is schematically shown, with numbers corresponding to the typical order of β-strands in the fold. The Cas6 structure with two RRM fold domains, which are numbered according to the topology diagram, is shown as the ribbon diagram. The structural comparison of cryo-electron microscopy models demonstrates a striking similarity of effector complex organization of types I and III, despite the absence of significant sequence similarity between the corresponding subunits. The Electron Microscopy Data Bank (EMDB) codes are indicated for each structure. crRNA, CRISPR RNA; RAMP, Repeat-Associated Mysterious Protein. (Online version in colour.)

**Figure 3.**
Origin and evolution of class 1 CRISPR-Cas systems. The figure depicts a hypothetical scenario of the origin of class 1 CRISPR-Cas from an ancestral signalling system and its subsequent evolution yielding the extant type III and type I systems, as well as reductive evolution that produced type IV systems and minimalist variants of type I system recruited by Tn7 transposons. The key evolutionary events are described to the right of the images. ‘GGDD’, a key catalytic motif of the cyclase/polymerase domain of Cas10; ‘A’, catalytically active RRM domain of a RAMP protein; RE, LE, right and left end, respectively; TR, terminal repeats. (Online version in colour.)

**Figure 4.**
Key features, general organization and domain architectures of class 2 systems. Schematic of the complexes of effector proteins, with the target DNA or RNA, guide RNA and (for type II) tracrRNA shown on the top of the figure, and the domain architectures of the effector proteins depicted underneath. The catalytic residues of the effector nuclease domain and, for Cas12a and Cas13a, the residues shown to be required for pre-crRNA processing are indicated in red. The Protein Data Bank (PDB) codes are included for proteins with solved structures. HTH, helix–turn–helix DNA-binding domain. The tracrRNA, the pre-crRNA processing catalytic sites and the nicking, target strand-cleaving nuclease of the type V effectors are denoted by asterisks to indicate that they are each present only in subsets of the type II and type V effectors. The catalytic amino acid residues of the target-cleaving and pre-crRNA processing nucleases are shown in red. The small blue boxes show the approximate location of pre-crRNA processing nuclease domain. I, II, III are the distinct amino acid motifs that jointly compose the catalytic site of the RuvC-like nuclease. In the motif signature, ‘x’ stands for any amino acid, and ‘..’ indicates that the catalytic residues are separated by a small, variable number of non-conserved residues. Adapted from [8], with permission. (Online version in colour.)

**Figure 5.**
Origin of the class 2 CRISPR-Cas effectors from MGE. The figure depicts a hypothetical scenario of the origin of class 2 CRISPR-Cas from non-autonomous transposons, for type II and type V systems, and from a defence system (toxin–antitoxin module) for type VI systems. IscB and TnpB are the inferred ancestors of the type II (Cas9) and type V (Cas12) effectors, respectively. Inserts that could have contributed to increased specificity and efficiency of the effectors are shown by grey rectangles of variable size. I, II and III are the distinct amino acid motifs that jointly compose the catalytic site of the RuvC-like nuclease. TR, terminal repeats. (Online version in colour.)

**Figure 6.**
‘IN and OUT’: exchange of components between CRISPR-Cas systems and MGE. The figure shows a hypothetical scenario of *cas* gene acquisition by evolving CRISPR-Cas systems from MGE (IN) and by MGE from CRISPR-Cas systems (OUT). Genes are shown by arrows. The colouring corresponds to distinct *cas* genes and is the same as in figure 1. Grey arrows denote any genes that are considered to be unrelated to CRISPR-Cas. Specific acquisition events are shown for class 1 and class 2 systems separately. Arrows indicate the inferred direction of the gene flow. HEPN, RNase of the HEPN (Higher Eukaryotes and Prokaryotes Nucleotide-binding domain) superfamily; RE, LE, right and left end, respectively; RT, reverse transcriptase; TR, terminal repeats. (Online version in colour.)
