PLoS Biol. 2023 Apr 21;21(4):e3002072.
doi: 10.1371/journal.pbio.3002072. eCollection 2023 Apr.

Distribution and molecular evolution of the anti-CRISPR family AcrIF7

Wendy Figueroa et al. PLoS Biol.

Abstract

Anti-CRISPRs are proteins capable of blocking CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated) systems, and their genes are typically located on mobile genetic elements. Since their discovery, numerous anti-CRISPR families have been identified. However, little is known about the distribution and sequence diversity of members within a family, or about how these traits influence the anti-CRISPR's function and evolution. Here, we use AcrIF7 to explore the dissemination and molecular evolution of an anti-CRISPR family. We uncovered 5 subclusters and prevalent anti-CRISPR variants within the group. Remarkably, AcrIF7 homologs display high similarity despite their broad geographical, ecological, and temporal distribution. Although mainly associated with Pseudomonas aeruginosa, AcrIF7 was identified in distinct genetic backgrounds, indicating horizontal dissemination, primarily by phages. Using mutagenesis, we recreated variation observed in databases and also extended the sequence diversity of the group. Characterisation of the variants identified residues key for the anti-CRISPR function and others contributing to its mutational tolerance. Moreover, molecular docking revealed that variants with affected function lose key interactions with their CRISPR-Cas target. Analysis of publicly available data and the generated variants suggests that the dominant AcrIF7 variant corresponds to the minimal and optimal anti-CRISPR selected in the family. Our study provides a blueprint to investigate the molecular evolution of anti-CRISPR families.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Location of the anti-CRISPR gene g2 in the genome of phage H70 and inhibition of the CRISPR-Cas system I-F.
(A) The map represents a region of the H70 phage genome (orfs 14 to 37 shown as arrows). The grey arrows correspond to core genes conserved in the phage group D3112virus, whereas the white arrows represent accessory orfs [20]. The anti-CRISPR locus, encoding the anti-CRISPR gene (g2) and a putative DNA-binding gene (g9), is shown in green. (B) Serial dilutions of different CRISPR-sensitive phages (indicated above the figure) were spotted on bacterial lawns of the PA14, PA14 ΔCRISPR-cas (PA14 ΔCR), PA14-pUCP24-L3, and PA14-pUCP24-L3(g2) strains. Phage infection (shown as plaques) denotes a lack of CRISPR-Cas defence due to either the absence of the CRISPR-Cas system (PA14 ΔCR) or anti-CRISPR activity (PA14-pUCP24-L3(g2)). Note that the titre of each phage stock was different, and the results are therefore not comparable between phages. CRISPR, clustered regularly interspaced short palindromic repeats.
Fig 2. Diversity of members of the anti-CRISPR family AcrIF7.
(A) Neighbour-joining unrooted tree displaying the patterns of sequence similarity among protein sequences homologous to G2 of phage H70. Homologous sequences were identified through BLASTp searches against anti-CRISPRdb [12] and proteomes of Pseudomonas phages and P. aeruginosa genomes from GenBank (see Methods). The 145 amino acid sequences presented in the tree were aligned with PRALINE [25]. The tree was inferred from the resulting alignment with Seaview v4.6 [26] (BioNJ method). Grey dots on tree branches represent bootstrap support values >80 calculated from 1,000 replicates. Subclusters (sc) identified in the tree are indicated in orange along with the number of sequences in them. The 25 nonredundant sequences from anti-CRISPRdb are labelled with their corresponding identifier in the database (“anti_CRISPR” prefix). The remaining sequence labels indicate the GenBank assembly identifier and protein accession number separated by a hyphen (“-”) except for the sequence corresponding to G2. Asterisks mark sequences identified in phage genomes, whereas purple dots pinpoint sequences that have been experimentally verified as anti-CRISPRs. Labels in blue denote nonredundant sequences within their corresponding subcluster (excluding those in sc5) and thus represent the diversity of protein sequences in the tree. Dotted line circles indicate sequences identified in non-P. aeruginosa genomes. Notes on the hybrid nature of the sequence in sc3, and the lack of identifiable anti-CRISPR activity against the systems I-F and I-E of P. aeruginosa in a homolog (accession: WP_034755374.1) of sc5, correspond to references [27] and [21]. (B–F) Metadata associated with the 117 P. aeruginosa genomes encoding an AcrIF7 homolog. Data plotted in panels B–D (bacterial strain source, country, and year of isolation, respectively) were extracted from the genomes' BioSample records (S1 Data). Sequence types (ST) presented in panel E were identified using the pubMLST P. aeruginosa scheme ([28], http://pubmlst.org/paeruginosa) and the mlst tool v.2.8 ([29], https://github.com/tseemann/mlst). The occurrence of CRISPR-Cas systems in the AcrIF7-carrier genomes, displayed in panel F, was assessed with cctyper v1.4.4 [30]. CRISPR, clustered regularly interspaced short palindromic repeats; MLST, multilocus sequence typing.
Fig 3. Alignment of nonredundant protein sequences of the AcrIF7 family.
The 24 nonredundant protein sequences selected as representative of the diversity observed among AcrIF7 homologs (excluding sc5; see Methods and sequence labels in blue in Fig 2) were aligned with PRALINE [25]. The resulting alignment was visualised with Jalview v2.11.1.4 [31]. Identifiers of the homologous variants, shown on the left side of the alignment, correspond to those described in Fig 2. The subcluster to which the variant belongs is indicated next to its identifier. The length of each variant sequence is displayed on the right side of the alignment, next to the bar plot illustrating the number of observations of the different variants among the genomes where a G2 homolog was identified (see Fig 2, Table B in S2 Data). Residues in the alignment are colour coded based on their level of conservation in a given position and the residue type they belong to according to the ClustalX shading scheme, indicated at the bottom-right of the figure. The conservation level and consensus sequence of the alignment are represented with a bar plot and sequence logo at the bottom of the figure, respectively. Residues identified in this study as important for the G2 anti-CRISPR activity are pinpointed with solid arrows or a dotted line below the alignment. The dotted line indicates that the lack of the underscored residues in G2 nullifies the anti-CRISPR activity of the protein (see Fig 5). Residues important for the AcrIF7-CRISPR-Cas interaction as reported by Kim and colleagues [18] are identified with open arrows. Red arrows denote residues on which mutations drive the loss of the AcrIF7 function or interaction, whereas orange arrows indicate residues on which mutations have a partial effect. CRISPR, clustered regularly interspaced short palindromic repeats.
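The bar plot of variant observations described above depends on collapsing identical protein sequences and counting how often each one occurs. The following is a minimal sketch of that idea, assuming that redundancy means exact sequence identity; the function name, the record layout, and the toy sequences are hypothetical and are not taken from the study, whose actual procedure is described in its Methods.

```python
from collections import Counter


def collapse_redundant(records):
    """Group identical protein sequences and count their occurrences.

    records: iterable of (identifier, sequence) pairs.
    Returns a list of (representative_id, sequence, count) tuples,
    most frequent variant first. The representative is the first
    identifier seen for each distinct sequence.
    """
    counts = Counter()
    first_id = {}
    for identifier, seq in records:
        seq = seq.strip().upper()  # normalise case and whitespace
        counts[seq] += 1
        first_id.setdefault(seq, identifier)
    return [(first_id[s], s, n) for s, n in counts.most_common()]


# Toy sequences for illustration only (not real AcrIF7 data)
toy = [("a", "MKTL"), ("b", "MKTL"), ("c", "MKSL"), ("d", "mktl")]
variants = collapse_redundant(toy)
```

Here `variants` holds one entry per distinct sequence together with its number of observations, which is the information a per-variant bar plot needs.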
Fig 4. Comparative analysis of genome regions harbouring AcrIF7 homologs.
The figure shows a similarity network (A) and pairwise comparisons (B) at the nucleotide level of regions containing an AcrIF7 homolog in P. aeruginosa and phage genomes. (A) One hundred nine regions containing an AcrIF7 homolog gene plus at least 10 kb of flanking sequence (where available; see Methods) were extracted and compared all-vs-all with mash. Regions were then clustered based on mutation distance and visualised in a network to determine their diversity and frequency among the set of analysed genomes. Connected components clustering identified the clusters (NC) and singletons (NS) in the network. GC content and size of the compared region are indicated below the network. (B) Regions from complete genomes selected for pairwise comparison are paired with their closest match. Only 5 kb of each flanking side of the regions are shown. The organism name, GenBank assembly accession, and Network Cluster or Singleton (in parentheses) of the regions are indicated next to their corresponding gene maps. Instances where more than one AcrIF7 homolog was detected in the same genome are distinguished with a suffix letter added to the GenBank accession number. The AcrIF7 homolog genes and aca1 are colour coded as indicated in the figure. The subcluster type of the different AcrIF7 homolog genes (see Fig 2) is shown above the corresponding gene arrow. Where available, functions assigned to ORF products, as indicated in the GenBank file, are displayed above the corresponding arrow. Asterisks mark functions assigned as putative. Light yellow arrows denote ORFs encoding homologs of Aca1 overlooked in the original annotations. White arrows indicate overlooked ORFs with unknown functions. The percentage of sequence identity detected between homologous regions, depicted as grey connecting blocks, is indicated next to the corresponding block. For homologous regions containing an AcrIF7 homolog gene, the percentage of identity between the gene sequences is additionally indicated in parentheses.
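Connected components clustering of a pairwise distance network, as mentioned for panel A, can be sketched as follows. This is an illustrative union-find implementation, not the authors' pipeline: the distance cutoff `max_distance`, the region names, and the toy distances are assumptions, since the caption does not state the threshold used.

```python
def connected_components(nodes, distances, max_distance):
    """Split nodes into clusters and singletons.

    nodes: list of region names.
    distances: dict mapping (node_a, node_b) to a pairwise distance.
    max_distance: hypothetical cutoff; pairs at or below it are linked.
    Returns (clusters, singletons), both sorted for reproducibility.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= max_distance:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)

    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons


# Toy distances for illustration only
toy_nodes = ["r1", "r2", "r3", "r4"]
toy_dist = {
    ("r1", "r2"): 0.01, ("r2", "r3"): 0.02, ("r1", "r3"): 0.20,
    ("r1", "r4"): 0.30, ("r2", "r4"): 0.30, ("r3", "r4"): 0.30,
}
clusters, singletons = connected_components(toy_nodes, toy_dist, max_distance=0.05)
```

In the toy data, r1 and r3 land in the same cluster even though their direct distance exceeds the cutoff, because both are linked to r2. That transitivity is the defining property of connected components clustering.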
Fig 5. Impact of genetic variation on AcrIF7 function and structure.
(A) Efficiency of G2 mutants at inhibiting the CRISPR-Cas system I-F. The lollipop charts show the EOP of the CRISPR-sensitive phage JBD18 on PA14 carrying different variants of G2, normalised to the titre of the same phage in PA14 harbouring G2 WT. Asterisks denote adjusted p-values ≤0.05 (raw data of replicates and p-values can be found in S5 Data). (B) Mutational map of G2 variants generated by random mutagenesis. The colours represent the phenotype: wild-type (in blue), partial loss-of-function (in yellow), or null activity (in red). The changes in each mutant are shown next to the map (e.g., Mut-A1 carries the mutation F19S, whereas Mut-A3 carries the mutations F4S, D29E, and V45D). The WT sequence of G2 is displayed at the top of the panel, with each mutated position coloured according to the phenotype of the mutant that carried changes in that position. Below the sequence, a heatmap represents the dN/dS value for each of the amino acids in G2 (S6 Data). Black indicates strong negative selection, whereas green indicates neutral selection. Rectangles with a bold black contour indicate that those specific mutations (both position and amino acid change) were found in sequences in the databases. (C) AlphaFold2 prediction of G2 structure showing residues with neutral mutations (in blue) or loss-of-function mutations (in yellow or red). A protein model prediction is also shown for the mutants mutA/mutC12, which lack 13 amino acids at the C-terminus. Amino acids in pink correspond to loss-of-function mutations; the figure shows the displacement of Y32 in the structure of the mutant, while V45 and V40 remain in the same predicted position as in G2 WT. (D) Mutational map of G2 variants generated by site-directed mutagenesis. The figure shows the mutations recreated based on the results of the random mutagenesis and the AlphaFold structures. CRISPR, clustered regularly interspaced short palindromic repeats; EOP, efficiency of plating; WT, wild-type.
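The normalisation described for panel A divides the phage titre on a strain carrying a G2 variant by the titre of the same phage on the strain carrying G2 WT. A minimal sketch of that arithmetic follows, assuming titres are expressed in the same units (for example, plaque-forming units per ml); the function name, the mutant labels, and the numbers are hypothetical and are not data from the study.

```python
def relative_eop(titre_on_variant, titre_on_wt):
    """Return the efficiency of plating relative to the wild-type control.

    Both titres must use the same units. A value of 1.0 means the variant
    protects the phage as well as the wild-type anti-CRISPR; values close
    to 0 indicate loss of anti-CRISPR activity.
    """
    if titre_on_wt <= 0:
        raise ValueError("wild-type titre must be positive")
    return titre_on_variant / titre_on_wt


# Toy titres for illustration only
toy_titres = {"G2 WT": 2.0e9, "mutant_x": 1.0e9, "mutant_y": 2.0e5}
eop = {
    name: relative_eop(titre, toy_titres["G2 WT"])
    for name, titre in toy_titres.items()
}
```

Because EOP values span several orders of magnitude, they are usually compared on a logarithmic scale.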
Fig 6. Residue-residue interactions of AcrIF7 variants with the Cascade complex.
The interaction of the AlphaFold model of G2 and Cas8f is shown in the left panel. In the right panel, the matrix shows the residue–residue interactions between the AcrIF7 mutants and Cas8f. At the top, the residues of Cas8f that interact with AcrIF7 are displayed, whereas the panel on the left represents each of the AcrIF7 variants assessed in the docking analysis. The numbers inside the squares denote the interacting AcrIF7 residue(s); e.g., residue R24 of Cas8f interacts with residue E50 of the long AcrIF7 (AcrIF7L) and with E34 of MutA1 and MutA7, whereas it does not interact with any residue of G2 WT. The colour of the squares reflects the anti-CRISPR activity of the variant: blue for wild-type, yellow for partial loss of function, and red for null mutants. The proteins used for the analysis were AcrIF7L (7JZX) [36], AcrIF7S (6M3N) [18], and G2 together with all the variants we generated (AlphaFold models). CRISPR, clustered regularly interspaced short palindromic repeats.
Fig 7. Experimental evolution of H70 in PA14 WT and PA14 ΔCR.
Panel (A) illustrates the evolution of the EOP throughout the passages in either PA14 WT (top) or PA14 ΔCR (bottom). Each coloured shape represents a different lineage (biological replicates) with 3 technical replicates each. No statistically significant difference in the EOP was found in one-way ANOVA tests. Panel (B) shows the unique variants (only present in one of the populations) in the H70 genome (from the last passage) that differ from the reference (NC_027384.1) and comprise more than 1% of the population. Panel (C) represents the variants found in the RGP G that is composed of the genes acrIF7 (g2) and aca1 (g9), and their respective intergenic regions. EOP, efficiency of plating; RGP, region of genomic plasticity; WT, wild-type.
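The selection rule described for panel B (variants present in only one population and comprising more than 1% of it) can be sketched as a simple filter. This is an illustration under stated assumptions: the data layout, the function name, and the toy variants are hypothetical, and the sketch applies the frequency cutoff before testing for uniqueness, an ordering the caption does not specify.

```python
def unique_variants(populations, min_frequency=0.01):
    """Return, per population, the variants found in that population only.

    populations: dict mapping a lineage name to a dict of
        variant -> frequency (fraction of reads, 0 to 1).
    min_frequency: variants must exceed this fraction to be considered.
    """
    # Keep only the variants above the frequency cutoff in each population
    passing = {
        name: {v for v, freq in variants.items() if freq > min_frequency}
        for name, variants in populations.items()
    }
    unique = {}
    for name, variants in passing.items():
        # Pool the variants that pass the cutoff in every other population
        others = set().union(
            *(found for other, found in passing.items() if other != name)
        )
        unique[name] = sorted(variants - others)
    return unique


# Toy variants as (position, reference, alternative); illustration only
toy_populations = {
    "lineage_1": {(100, "A", "G"): 0.40, (250, "C", "T"): 0.005, (300, "G", "A"): 0.12},
    "lineage_2": {(300, "G", "A"): 0.30, (410, "T", "C"): 0.02},
}
unique = unique_variants(toy_populations)
```

In the toy data, the variant at position 300 is dropped because both lineages share it, and the variant at position 250 is dropped because it falls below the 1% cutoff.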
