Genome Med. 2021 Jan 6;13(1):2.
doi: 10.1186/s13073-020-00809-3.

Improved analysis of CRISPR fitness screens and reduced off-target effects with the BAGEL2 gene essentiality classifier


Eiru Kim et al. Genome Med.

Abstract

Background: Identifying essential genes in genome-wide loss-of-function screens is a critical step in functional genomics and cancer target finding. We previously described the Bayesian Analysis of Gene Essentiality (BAGEL) algorithm for accurate classification of gene essentiality from short hairpin RNA and CRISPR/Cas9 genome-wide genetic screens.

Results: We introduce an updated version, BAGEL2, which employs an improved model that offers a greater dynamic range of Bayes Factors, enabling detection of tumor suppressor genes; a multi-target correction that reduces false positives from off-target CRISPR guide RNAs; and a cross-validation strategy that improves performance ~10× over the prior bootstrap resampling approach. We also describe a metric for screen quality at the replicate level and demonstrate that different algorithms handle lower-quality data in substantially different ways.
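The combination of Bayes Factors and cross-validation described above can be illustrated with a minimal sketch. It is not the BAGEL2 source code: the function names, the hand-written Gaussian kernel density estimate, the bandwidth, the density floor, and the toy fold changes are all assumptions made for illustration. Each gene is scored while held out of the training data, using densities learned from the reference genes in the other folds.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def log2_bayes_factor(fold_changes, ess_train, non_train, bandwidth=0.5, floor=1e-12):
    """Sum over guides of log2(essential density / non-essential density).

    The bandwidth and the density floor are illustrative assumptions.
    """
    ess = gaussian_kde(ess_train, bandwidth)
    non = gaussian_kde(non_train, bandwidth)
    return sum(
        math.log2(max(ess(fc), floor) / max(non(fc), floor)) for fc in fold_changes
    )


def k_folds(genes, k, seed=0):
    """Split gene names into k folds after a seeded shuffle."""
    shuffled = sorted(genes)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def cross_validated_bf(gene_fc, essential_ref, nonessential_ref, k=10, seed=0):
    """Score every gene once, training only on reference genes outside its fold."""
    bf = {}
    for fold in k_folds(gene_fc, k, seed):
        held_out = set(fold)
        ess_train = [fc for g in sorted(essential_ref)
                     if g in gene_fc and g not in held_out for fc in gene_fc[g]]
        non_train = [fc for g in sorted(nonessential_ref)
                     if g in gene_fc and g not in held_out for fc in gene_fc[g]]
        for gene in fold:
            bf[gene] = log2_bayes_factor(gene_fc[gene], ess_train, non_train)
    return bf


# Toy data (hypothetical): two guide-level log fold changes per gene.
essential_ref = {f"E{i}" for i in range(20)}
nonessential_ref = {f"N{i}" for i in range(20)}
gene_fc = {}
for i in range(20):
    gene_fc[f"E{i}"] = [-3.0 + 0.05 * i, -3.0 - 0.05 * i]
    gene_fc[f"N{i}"] = [0.05 * i, -0.05 * i]
gene_fc["X"] = [-3.1, -2.9]  # behaves like an essential gene
gene_fc["Y"] = [0.1, -0.1]   # behaves like a non-essential gene

bf = cross_validated_bf(gene_fc, essential_ref, nonessential_ref)
```

In this sketch each gene is scored exactly once, whereas a bootstrap scheme would rescore genes over many resampled training sets.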

Conclusions: BAGEL2 substantially improves on BAGEL in sensitivity, specificity, and performance, and establishes the new state of the art in the analysis of CRISPR knockout fitness screens. BAGEL2 is written in Python 3, and the source code, along with all supporting files, is available on GitHub (https://github.com/hart-lab/bagel).


Conflict of interest statement

TH is a consultant for Repare Therapeutics. The remaining authors declare no competing interests.

Figures

Fig. 1
Improvement of the BAGEL algorithm. a A brief flow diagram of CRISPR pooled library screen analysis using the BAGEL pipeline. b Improvements in the model selection algorithm. The red and blue curves are kernel density plots of the fold changes of reference core-essential and non-essential genes, respectively. The gray curve is the ratio (difference in logs) of core-essential density to non-essential density at each fold change. Because there are few data points in the marginal areas, BAGEL limited the fold-change range used for calculation to the region between a lower bound, the point where the blue curve reaches the density threshold (2^-7 was used in BAGEL), and an upper bound, the first local minimum of the ratio (red dashed lines). In BAGEL2, we employed linear regression to interpolate the marginal areas outside this region (black line). c Comparison of gene essentiality (Bayes Factor) between BAGEL and BAGEL2 in the RPE-1 cell line screened with TKOv3. Known tumor suppressors (NF2, KIRREL, and KEAP1), which BAGEL scored at BF ~ -20 together with hundreds of other genes, received much lower Bayes Factors in BAGEL2 and were clearly distinguished from other genes. d The dynamic range of BAGEL2 results increased relative to BAGEL across screens in the Avana dataset. e Jaccard index between essential gene sets predicted in Avana by 10-fold cross-validation and by bootstrapping. f Pearson correlation coefficient of essentiality across 517 cell lines in the Avana data between frequently amplified genes near ERBB2 on chromosome 17. After CRISPRcleanR was applied, the essentiality correlation due to the copy number amplification effect was corrected. g Prediction performance benchmark comparing BAGEL, BAGEL2 with linear interpolation and 10-fold cross-validation (BAGEL2 Raw), and BAGEL2 with CRISPRcleanR applied (BAGEL2 CCR applied)
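The idea in panel b, fitting a line to the log density ratio inside the trusted fold-change region and using that line in the sparse marginal areas, can be sketched as follows. This is an illustration under stated assumptions rather than the BAGEL2 implementation: the function names, the piecewise-linear lookup inside the region, and the toy grid are hypothetical, and the grid is assumed to be sorted and to cover the interval between the two bounds.

```python
from bisect import bisect_left


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_log_ratio(grid, values, lower, upper):
    """Build a log-ratio function that trusts the data only inside [lower, upper].

    Inside the bounds the tabulated values are interpolated between grid points;
    outside the bounds the regression line fitted inside the bounds is used.
    """
    inside = [(x, y) for x, y in zip(grid, values) if lower <= x <= upper]
    slope, intercept = fit_line([x for x, _ in inside], [y for _, y in inside])

    def log_ratio(x):
        if x < lower or x > upper:
            return slope * x + intercept
        i = bisect_left(grid, x)
        if grid[i] == x:
            return values[i]
        x0, x1 = grid[i - 1], grid[i]
        y0, y1 = values[i - 1], values[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return log_ratio


# Toy log-ratio curve (hypothetical): well behaved inside [-4, -1], noisy outside.
grid = [-6.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0]
values = [0.0, 20.0, 9.0, 7.0, 5.0, 3.0, -7.0, 4.0]
log_ratio = make_log_ratio(grid, values, lower=-4.0, upper=-1.0)

for fold_change in (-6.0, -2.5, 1.0):
    print(fold_change, log_ratio(fold_change))
```

Because the line continues to rise as fold changes become more negative and to fall as they become more positive, guides in the margins still contribute to the score, which is one way to read the greater dynamic range described in panels c and d.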
Fig. 2
The multi-targeting effect correction reduces false positives from off-targets with 1-bp mismatch. a, b Increase in Bayes Factors of multi-targeting gRNAs that target only a single protein-coding gene, compared with the Bayes Factors of gRNAs that target the protein-coding gene without any other targets, a before the multi-targeting effect correction and b after the multi-targeting effect correction. c The number of essential genes across good-quality cell lines (F-measure > 0.85) in the Avana dataset predicted by BAGEL2 with or without CRISPRcleanR and by other algorithms, CERES, MAGeCK, and JACKS, with cut-off thresholds of BF 10, BF 7, score -0.6, FDR 0.15, and p value 0.001, respectively. The cut-off thresholds were chosen to yield similar numbers of essential genes. d The number of false positives predicted by each algorithm. False positives were defined as genes not expressed in the RNA-seq data of the corresponding cell lines. BAGEL2 after multi-targeting effect correction shows results comparable with CERES and far fewer false positives than MAGeCK and JACKS. e The number of false positives in predicted essential gene sets when the scope is limited to genes whose gRNAs map to more than five 1-bp mismatched targets, which are likely to arise from multi-targeting effects of 1-bp mismatched targets. BAGEL2 after correction shows the best performance among the algorithms
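The false-positive definition used in panels d and e amounts to a set intersection, sketched below. The function names, the toy scores, and the expression cut-off of 1.0 are hypothetical; the caption does not state how "non-expressed" was thresholded. The Bayes Factor cut-off of 10 is one of the thresholds listed in panel c and is used here only as an example.

```python
def call_essential(bayes_factors, threshold):
    """Return the genes whose Bayes Factor exceeds the cut-off."""
    return {gene for gene, bf in bayes_factors.items() if bf > threshold}


def false_positives(predicted, expression, min_expression):
    """Predicted essential genes that are not expressed in the same cell line.

    Genes without an expression measurement are left out rather than counted.
    The min_expression cut-off is an assumption made for this sketch.
    """
    return {
        gene for gene in predicted
        if gene in expression and expression[gene] < min_expression
    }


# Toy values (hypothetical) for a single cell line.
bayes_factors = {"A": 25.0, "B": 12.0, "C": 3.0, "D": 40.0}
expression = {"A": 8.0, "B": 0.0, "C": 0.0, "D": 0.2}

predicted = call_essential(bayes_factors, threshold=10)
flagged = false_positives(predicted, expression, min_expression=1.0)
print(sorted(predicted), sorted(flagged))
```

Gene C is not expressed either, but it is not counted because it was never called essential.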
Fig. 3
Variable screen performance. a Kernel density estimates of reference core-essential genes (red) and non-essential genes (blue), with SNU761 replicate B as an example of a good screen (upper panel) and U178 replicate A as an example of a marginal screen (lower panel). The good screen shows clear separation between the core-essential and non-essential curves, whereas the marginal screen shows less separation. b The equation of the quality score. c The mean quality score of replicates and the F-measure at the cell line level are clearly correlated, with trends differentiated by the number of replicate screens per cell line. d Project Score CRISPR screen data followed the same trends as the Avana set. e, f Relationship between the number of false positives in e BAGEL2 results and f CERES results across 517 cell lines in the Avana data. Each dot is colored by the number of essential genes
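The F-measure in panel c is the harmonic mean of precision and recall. A minimal sketch of computing it for one cell line against the reference gene sets follows. The function name, the toy scores, and the Bayes Factor threshold are assumptions, since the caption does not state the threshold used; the quality score of panel b is not reproduced because its equation is not given in the text.

```python
def f_measure(bayes_factors, essential_ref, nonessential_ref, threshold):
    """Harmonic mean of precision and recall at a Bayes Factor cut-off.

    True positives are reference essential genes called essential; false
    positives are reference non-essential genes called essential.
    """
    called = {gene for gene, bf in bayes_factors.items() if bf > threshold}
    true_pos = len(called & essential_ref)
    false_pos = len(called & nonessential_ref)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / len(essential_ref)
    return 2 * precision * recall / (precision + recall)


# Toy scores (hypothetical) for a single cell line.
essential_ref = {"E1", "E2", "E3", "E4"}
nonessential_ref = {"N1", "N2", "N3", "N4"}
bayes_factors = {
    "E1": 50.0, "E2": 30.0, "E3": 12.0, "E4": 2.0,
    "N1": 11.0, "N2": -5.0, "N3": -20.0, "N4": 0.0,
}

score = f_measure(bayes_factors, essential_ref, nonessential_ref, threshold=10)
print(score)
```

With three of four reference essentials recovered and one reference non-essential miscalled, precision and recall are both 0.75 in this toy case.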

