Science. 2016 Feb 26;351(6276):aad4234.
doi: 10.1126/science.aad4234.

Direct CRISPR spacer acquisition from RNA by a natural reverse transcriptase-Cas1 fusion protein

Sukrit Silas et al. Science. 2016.

Abstract

CRISPR systems mediate adaptive immunity in diverse prokaryotes. CRISPR-associated Cas1 and Cas2 proteins have been shown to enable adaptation to new threats in type I and II CRISPR systems by the acquisition of short segments of DNA (spacers) from invasive elements. In several type III CRISPR systems, Cas1 is naturally fused to a reverse transcriptase (RT). In the marine bacterium Marinomonas mediterranea (MMB-1), we showed that a RT-Cas1 fusion protein enables the acquisition of RNA spacers in vivo in a RT-dependent manner. In vitro, the MMB-1 RT-Cas1 and Cas2 proteins catalyze the ligation of RNA segments into the CRISPR array, which is followed by reverse transcription. These observations outline a host-mediated mechanism for reverse information flow from RNA to DNA.

Figures

Figure 1. Phylogenetic distribution and domain structure of RT-Cas1 fusions
(A) Taxonomic summary of unique RT-Cas1 protein records obtained from the NCBI CDART engine (current as of 05/2015). Numbers of Cas1 protein records and bacterial species are shown with (left) a fused RT domain; (center) RT and an additional N-terminal extension containing a Cas6-like motif; and (right) Cas1 with no additional annotated domain. Only phyla containing RT-Cas1 fusions are listed. (B) 16S rRNA-based tree showing major bacterial phyla, with RT-Cas1 containing phyla in red (adapted from (47)). (C) Schematic showing the domain organization of HIV RT (P03366), a group II intron RT (TeI4c from Thermosynechococcus elongatus BP-1; WP_011056164), Arthrospira platensis RT-Cas1 (WP_006620498), Marinomonas mediterranea RT-Cas1 (WP_013659858), and E. coli Cas1 (NP_417235). Conserved RT motifs as defined in (48) are labeled 1 to 7. Motifs 0 and 2a are conserved in mobile group II intron and non-LTR-retrotransposon RTs (32). The YxDD sequence found in motif 5 contains two Asp residues at the RT active site. D: DNA binding domain, En: Endonuclease domain. Three α-helices found in the Thumb/X domain of HIV and group II intron RTs are indicated.
Figure 2. Spacer acquisition in E. coli by ectopic expression of MMB-1 Type III-B CRISPR components
(A) The MMB-1 Type III-B CRISPR operon consists of an 8-spacer CRISPR array (CRISPR03), followed by a canonical 6-gene cassette putatively encoding the Type III-B Cmr effector complex, two genes of unknown function (Marme_0671 and Marme_0670), then the RT-Cas1 and Cas2 genes, and finally a larger 58-spacer CRISPR array (CRISPR02). The locus is flanked by two ~200 bp direct repeats (green arrows). (B) Arrangement of MMB-1 Type III-B CRISPR components under inducible promoters on pBAD vectors for ectopic expression in E. coli. (C) Spacer detection frequency after overnight induction of E. coli carrying pBAD expression vectors with arabinose and IPTG. Wild-type RT-Cas1, RT active site mutant (YAAA), and Cas1 domain mutants E790A and E870A were tested with or without the Plac-driven Cmr “effector” gene cassette. Cas2 Δ32-* and RT domain Δ299-588 deletion mutants were tested without the Cmr cassette. Where shown, bars indicate values for two biological replicates (n.d.: not determined). (D) Histogram showing normalized counts of E. coli genomic protospacers from the wild-type RT-Cas1 and RTΔ spacer acquisition experiments, distributed by mappable length. Pooled data from several experiments are presented. (E) Nucleotide probabilities at each position along the wild-type RT-Cas1-acquired protospacers in (D) including 15 bp of flanking sequence on each side. Due to varying protospacer lengths, two panels are shown with spacer 5’ and 3’ ends anchored at positions 15 and 35, respectively. (F) Cumulative normalized distribution of spacers in (D) among E. coli protein-coding ORFs sorted by expression level (normalized RNAseq read counts from (49); FPKM: fragments per kb per million reads), with most highly expressed genes listed first. A total of 2,470 wild-type RT-Cas1-acquired and 5,569 RTΔ-acquired spacers mapping to E. coli genes are included. Dashed black lines show the range of values from a Monte Carlo simulation with random assortment (no transcription-related bias).
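The random-assortment comparison described for panel (F) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the number of trials, and the choice to weight each gene by its length are assumptions made here for illustration, since the caption states only that spacers are assorted randomly with no transcription-related bias. Genes are assumed to be listed from most to least highly expressed.

```python
import random

def cumulative_fraction(spacer_counts):
    """Cumulative fraction of spacers across genes, in the given gene order."""
    total = sum(spacer_counts)
    running, out = 0, []
    for count in spacer_counts:
        running += count
        out.append(running / total if total else 0.0)
    return out

def monte_carlo_bounds(gene_lengths, n_spacers, n_trials=200, seed=0):
    """Min/max cumulative curves when spacers are assigned to genes at random.

    Assumption (not from the source): each gene is drawn with probability
    proportional to its length, and expression level is ignored.
    """
    rng = random.Random(seed)
    n_genes = len(gene_lengths)
    lower = [1.0] * n_genes
    upper = [0.0] * n_genes
    for _ in range(n_trials):
        counts = [0] * n_genes
        for gene in rng.choices(range(n_genes), weights=gene_lengths, k=n_spacers):
            counts[gene] += 1
        curve = cumulative_fraction(counts)
        lower = [min(a, b) for a, b in zip(lower, curve)]
        upper = [max(a, b) for a, b in zip(upper, curve)]
    return lower, upper

# Hypothetical toy data: four genes sorted by decreasing expression.
observed = cumulative_fraction([6, 3, 1, 0])
lower, upper = monte_carlo_bounds([900, 1200, 1500, 600], n_spacers=10)
```

An observed curve that rises above the upper bound at the highly expressed end of the gene list would be consistent with a transcription-related bias in spacer acquisition.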
Figure 3. RT-Cas1 mediated spacer acquisition in Marinomonas mediterranea
(A) Arrangement of Marme_0670, RT-Cas1, and Cas2 genes on pKT230 broad-host-range vectors under control of the putative 16S rRNA promoter (100 bp sequence upstream of the M. mediterranea 16S rRNA gene) for over-expression in MMB-1. New spacers were amplified from the genomic CRISPR03 array. (B) Spacer detection frequency after overnight growth of MMB-1 transconjugants carrying pKT230 over-expression vectors. Two clones each from two independent conjugations carrying either wild-type RT-Cas1, Cas1 domain mutants E790A or E870A, RT domain Δ299-588 deletion mutants, or an empty pKT230 vector were tested. Bars depict spacer acquisition frequencies for two transconjugants. (C) Histogram showing normalized counts of MMB-1 genomic protospacers from the wild-type RT-Cas1 and RTΔ spacer acquisition experiments, distributed by mappable length. Pooled data from several experiments are presented. (D) Nucleotide probabilities at each position along the wild-type RT-Cas1-acquired protospacers in (C) including 15 bp of flanking sequence on each side. Due to varying protospacer lengths, two panels are shown with spacer 5’ and 3’ ends anchored at positions 15 and 35, respectively. (E) Cumulative distribution of spacers in (C) among MMB-1 genes sorted by RNAseq FPKM, with most highly expressed genes listed first. 455 wild-type RT-Cas1, and 341 RTΔ-acquired spacers mapping to MMB-1 genes are included. Guides are drawn along the x-axis at top 10% and top 50% genes by expression level. Monte Carlo bounds were calculated as in Figure 2F. rRNA genes have been excluded from this analysis as spacers were rarely acquired from rRNA.
Figure 4. Spacer acquisition from RNA in the MMB-1 Type III-B system
(A) Spacers acquired from a host genome could conceivably originate from either RNA or DNA. To test for an RNA origin, we used an engineered self-splicing transcript, which produces an RNA sequence junction that is not encoded by DNA. Bases that were mutated to provide flanking exon sequences favorable for td intron splicing are separated by the 393 bp intron in the DNA template. Following transcription and splicing, the two exons are brought together to form a novel junction containing the “identifying mutations”. Newly acquired spacers that contain this exon-junction indicate spacer acquisition from an RNA target. (B) Alignments of some of the genome-contiguous spacers (gray) and several newly acquired exon-junction spanning spacers (red) to the genomic and split-gene sequences, respectively. Bases mutated to facilitate td intron splicing are underlined in the genomic sequences. Identifying mutations are depicted as colored bases, and the splice sites are indicated by green triangles. The highlighted ssrA exon-junction spanning spacer (bottom) is antisense to the spliced tmRNA and differs from a putative DNA template by the 5 expected mutations. (C) All unique spacers spanning the td-intron splice site that did not carry the engineered mutations. The maximum number of mismatches when these spacers were mapped to the wild-type genomic locus is indicated. None of the identifying mutations were observed among these sporadic mismatches. The spacers in (B) were in addition to four spacers (1 for the S15 and 3 for the ssrA construct) that align to the unspliced exon-intron junction and could have been derived from either DNA or (nascent) RNA.
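The logic of the exon-junction test can be sketched in a few lines of Python. This is a simplified illustration, not the authors' analysis pipeline: the function names and the toy sequences are hypothetical, exact string matching stands in for read alignment, and the check for the identifying mutations is omitted. A spacer counts as junction-spanning if it, or its reverse complement, matches the spliced exon1+exon2 sequence and covers bases on both sides of the splice site.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def spans_junction(spacer, exon1, exon2):
    """True if the spacer (either strand) matches the spliced sequence
    and includes at least one base from each exon."""
    spliced = exon1 + exon2
    junction = len(exon1)  # index of the first base of exon2
    for candidate in (spacer, reverse_complement(spacer)):
        start = spliced.find(candidate)
        while start != -1:
            if start < junction < start + len(candidate):
                return True
            start = spliced.find(candidate, start + 1)
    return False

# Hypothetical toy exons; real exons and spacers are much longer.
exon1, exon2 = "ACGTAC", "GGATCC"
sense_hit = spans_junction("TACGGA", exon1, exon2)      # crosses the junction
antisense_hit = spans_junction("TCCGTA", exon1, exon2)  # reverse complement of TACGGA
exon_only = spans_junction("ACGTAC", exon1, exon2)      # lies entirely within exon1
```

Because the spliced junction exists only in RNA, a spacer that passes this kind of check cannot be explained by copying the intron-containing DNA template directly.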
Figure 5. Site-specific CRISPR DNA cleavage/ligation by RT-Cas1/Cas2
(A) Schematic of CRISPR DNA substrates and products of cleavage/ligation reactions. The substrate was a 268 bp DNA containing the leader (gray), the first two repeats (R1 and R2; orange) and spacers (S1 and S2; green), and part of the third repeat (orange) of the MMB-1 CRISPR03 array. Cleavages (arrowheads) occur at the boundaries of the first repeat with concomitant ligation of a DNA or RNA oligonucleotide (blue) to the 3’ fragment, yielding products of the sizes shown. (B) Internally labeled CRISPR DNA and a 33-nt dsDNA were incubated with no protein (None, lane 1), RT-Cas1 (lane 2), Cas2 (lane 3), or a 1:2 mixture of RT-Cas1 and Cas2 (lane 4). The sizes of products determined from sequencing ladders in parallel lanes are indicated (left). (C) Internally labeled CRISPR DNA was incubated with WT RT-Cas1 and Cas2 without (lane 1) or with a 21-nt RNA (lane 2), 35-nt RNA (lane 3), or 29-nt ssDNA (lane 4). (D) Internally labeled CRISPR DNA was incubated with WT RT-Cas1+Cas2 in the absence (none), or presence of a 29-nt ssDNA with either a 3’ OH (lane 2) or a 3’ phosphate (lane 3). (E) Nuclease digestion of 5’ end-labeled RNA and DNA oligonucleotides ligated to CRISPR DNA. Ligation reactions were done as in (C). After extraction with phenol-CIA and ethanol precipitation, the products were incubated with the indicated nucleases. An asterisk indicates that the sample was boiled to denature the DNA before adding the nuclease. (F) Ligation of 5’ end-labeled RNA and DNA oligonucleotides into CRISPR DNA by WT and mutant RT-Cas1 proteins. Lanes 1 and 6 show control reactions of internally labeled CRISPR with WT RT-Cas1+Cas2 and an unlabeled 35-nt ssRNA or 29-nt ssDNA oligonucleotide for comparison. Lanes 2-5 and 7-10 show reactions of unlabeled CRISPR DNA with 5’-end labeled 35-nt ssRNA and 29-nt ssDNA, respectively, and WT, E870A, and RTΔ RT-Cas1 plus Cas2. All reactions were done in the presence of dNTPs. (G) Effect of dNTPs. 
In the gel to the left, internally labeled CRISPR DNA was incubated with WT RT-Cas1 plus Cas2 in the presence of a 29-nt ssDNA (lanes 1 and 2) or 35-nt ssRNA (lanes 3 and 4) in the absence (lanes 1 and 3) or presence of 1 mM dNTPs (1 mM each of dATP, dCTP, dGTP, and dTTP; lanes 2 and 4). In the gel to the right, internally labeled CRISPR DNA was incubated with WT RT-Cas1+Cas2 in the presence of a 35-nt ssRNA oligonucleotide in the absence (none, lane 10) or presence of different dNTPs (1 mM) as indicated (lanes 5 to 9). Red and black dots indicate products resulting from cleavage and ligation of oligonucleotides at the junction of the leader and first repeat on the top strand and the junction of repeat 1 and spacer 1 on the bottom strand, respectively; cyan and purple dots indicate products of the size expected for cleavage and ligation of the oligonucleotide at the junctions of the second CRISPR repeat (see Fig. S10).
Figure 6. cDNA synthesis using RNA ligated to CRISPR DNA
(A) Schematic shows the CRISPR DNA substrate (leader, gray; repeat, orange; spacer, green) and the expected products of cleavage/ligation (top) followed by TPRT of the ligated RNA oligonucleotide (blue). cDNAs are shown as black dashes with arrowheads indicating the direction of cDNA synthesis. (B) Wild-type (WT) or mutant RT-Cas1 proteins plus Cas2 were incubated with 268 bp CRISPR DNA in the presence of 21-nt RNA oligonucleotide, labeled dCTP and unlabeled dATP, dGTP, and dTTP. The wild-type RT-Cas1+Cas2 complex yields labeled bands of the sizes expected (148 and 155 nt+oligo) for target DNA-primed reverse transcription (TPRT) of the RNA oligonucleotide ligated site-specifically at opposite boundaries of the first CRISPR DNA repeat (R1, lane 8). The labeled products were not detected with the RT domain (RTΔ; lane 9) or Cas1 active site (E870A, lane 10) mutants, but a background of labeled products is seen in the E870A lane due to the RT activity of the protein in the absence of cleavage and ligation (see Fig. S8). Labeled products were not detected in the absence of the RNA oligonucleotide (lanes 3 to 6) or CRISPR DNA (lanes 11 and 12). Separate lanes from the same gel (lanes 1 and 2) show the positions of cleavage-ligation products for RT-Cas1+Cas2 with internally labeled CRISPR DNA substrate. “None” indicates no protein added.

Comment in

  • RNA. CRISPR goes retro.
    Sontheimer EJ, Marraffini LA. Science. 2016 Feb 26;351(6276):920-1. doi: 10.1126/science.aaf2851. PMID: 26917756. No abstract available.
  • Microbial genetics: CRISPR memories of RNA.
    Waldron D. Nat Rev Genet. 2016 Apr;17(4):192-3. doi: 10.1038/nrg.2016.31. Epub 2016 Mar 7. PMID: 26948816. No abstract available.

References

    1. Barrangou R, et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science. 2007;315:1709–1712.
    2. Marraffini LA, Sontheimer EJ. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat Rev Genet. 2010;11:181–190.
    3. Bolotin A, Quinquis B, Sorokin A, Ehrlich SD. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology. 2005;151:2551–2561.
    4. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J Mol Evol. 2005;60:174–182.
    5. Pourcel C, Salvignol G, Vergnaud G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology. 2005;151:653–663.
