Nucleic Acids Res. 2021 Jul 2;49(W1):W125-W130.
doi: 10.1093/nar/gkab456.

CRISPRloci: comprehensive and accurate annotation of CRISPR-Cas systems

Omer S Alkhnbashi et al. Nucleic Acids Res. 2021.

Abstract

CRISPR-Cas systems are adaptive immune systems in prokaryotes, providing resistance against invading viruses and plasmids. The identification of CRISPR loci is currently a non-standardized, ambiguous process, requiring the manual combination of multiple tools, where existing tools detect only parts of the CRISPR-systems, and lack quality control, annotation and assessment capabilities of the detected CRISPR loci. Our CRISPRloci server provides the first resource for the prediction and assessment of all possible CRISPR loci. The server integrates a series of advanced Machine Learning tools within a seamless web interface featuring: (i) prediction of all CRISPR arrays in the correct orientation; (ii) definition of CRISPR leaders for each locus; and (iii) annotation of cas genes and their unambiguous classification. As a result, CRISPRloci is able to accurately determine the CRISPR array and associated information, such as: the Cas subtypes; cassette boundaries; accuracy of the repeat structure, orientation and leader sequence; virus-host interactions; self-targeting; as well as the annotation of cas genes, all of which have been missing from existing tools. This annotation is presented in an interactive interface, making it easy for scientists to gain an overview of the CRISPR system in their organism of interest. Predictions are also rendered in GFF format, enabling in-depth genome browser inspection. In summary, CRISPRloci constitutes a full suite for CRISPR-Cas system characterization that offers annotation quality previously available only after manual inspection.
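Because the predictions are rendered in GFF format, they can be read by a few lines of code as well as by a genome browser. The following is a minimal sketch that assumes standard nine-column, tab-separated GFF lines. The example record, including its feature type and attribute keys, is hypothetical and is not taken from actual CRISPRloci output.

```python
from typing import Dict, List, NamedTuple


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    strand: str
    attributes: Dict[str, str]


def parse_gff(text: str) -> List[GffFeature]:
    """Parse GFF text into features, skipping comments and malformed lines."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or directive/comment
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a standard nine-column record
        # Column 9 holds semicolon-separated key=value pairs.
        attrs = dict(
            item.split("=", 1) for item in cols[8].split(";") if "=" in item
        )
        features.append(
            GffFeature(cols[0], cols[1], cols[2],
                       int(cols[3]), int(cols[4]), cols[6], attrs)
        )
    return features


# Hypothetical record for illustration only.
EXAMPLE_GFF = (
    "##gff-version 3\n"
    "contig1\texample\trepeat_region\t100\t400\t.\t+\t.\tID=array1;Note=hypothetical\n"
)

if __name__ == "__main__":
    for feature in parse_gff(EXAMPLE_GFF):
        print(feature.seqid, feature.type, feature.start, feature.end, feature.strand)
```

Filtering the returned list by `type` or `strand` is then enough to pull out, for example, all arrays on one strand, provided the real output uses feature types that distinguish them.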


Figures

Graphical Abstract
The workflow of CRISPRloci. The CRISPRloci server provides the first resource for the prediction and annotation of all possible CRISPR elements. It integrates a series of advanced machine learning tools within a seamless web interface.
Figure 1.
The workflow of CRISPRloci. The workflow supports four types of input. If DNA is chosen as the input, CRISPRloci will identify the CRISPR arrays, predict their orientation and the leader sequence, and then extract the repeat and spacer sequences. Repeat sequences are then analyzed for their structural stability, while spacers are used to identify potential regions of self-targeting. If protein sequences are submitted as input, CRISPRloci will classify and report the protein type and role. The user can optionally input a set of repeat sequences. In this scenario, CRISPRloci will search the existing database for similar repeat sequences. The user will be provided with the hits as well as their region, similarity and e-value. Lastly, the user can provide viral DNA as the input. In this scenario, CRISPRloci will search for protospacers using a database of spacers. The user will be provided with the protospacer coordinates as well as a description of the host CRISPR arrays.
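Two steps of this workflow, extracting spacers from an array and locating protospacers in viral DNA, can be illustrated with a deliberately simplified sketch. It assumes the consensus repeat is already known and that all matches are exact. CRISPRloci itself relies on machine learning tools and database searches, so this is not the server's method, and every function name and sequence below is hypothetical.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_spacers(array: str, repeat: str) -> List[str]:
    """Return the sequences lying between consecutive exact repeat copies."""
    parts = array.split(repeat)
    # parts[0] and parts[-1] are the flanks outside the array, not spacers.
    return [part for part in parts[1:-1] if part]


def find_protospacers(target: str, spacers: List[str]) -> List[Tuple[str, str, int, int]]:
    """Find exact spacer matches on both strands of the target.

    Each hit is (spacer, strand, start, end) with 1-based inclusive
    coordinates on the forward strand of the target.
    """
    hits = []
    for spacer in spacers:
        for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
            pos = target.find(query)
            while pos != -1:
                hits.append((spacer, strand, pos + 1, pos + len(query)))
                pos = target.find(query, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy sequences for illustration only.
    repeat = "GTTTTAGA"
    array = "AAAC" + repeat + "ACGTACGTAC" + repeat + "TTGACCATGA" + repeat + "CC"
    spacers = extract_spacers(array, repeat)
    virus = "GGG" + "ACGTACGTAC" + "TTT" + reverse_complement("TTGACCATGA") + "AA"
    for hit in find_protospacers(virus, spacers):
        print(hit)
```

Running the same search against the host genome itself, while excluding the array's own coordinates, would give a rough analogue of the self-targeting check. Real repeats and protospacers often carry mismatches, so an exact-match search like this one would miss many true hits.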
Figure 2.
Result data and visualisation. (A) Top: CRISPRloci shows the two most thermodynamically stable secondary structure candidates, with the minimum free energy structure highlighted in red. Bottom: the base-pair probability matrices computed by RNAfold and the averaged sub-matrices associated with the repeat structure. Additionally, when the repeat is folded together with its sequence context, the corresponding structure is highlighted in green. (B) CRISPRloci provides a global overview of the CRISPR–Cas systems present in the genome and visualizes the results in an interactive genome map, which can be zoomed and clicked for additional information. (C) A table of CRISPR repeat annotations summarizing the results, including strand and subtype. The list is clickable, revealing additional information about the locus of interest, including consensus repeat sequence, array size and organisms that harbour similar CRISPRs. (D) An overview of the protospacer sequence locations in the viral/plasmid/phage genome and visualization of the results in an interactive genome map, including a full annotation of spacer sequences in the host genomes.

