Mol Syst Biol. 2018 Jan 23;14(1):e8064.
doi: 10.15252/msb.20178064.

Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens

Reinoud de Groot et al. Mol Syst Biol. 2018.

Abstract

High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CRISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9.

Keywords: CRISPR‐Cas9; arrayed library; functional genomics; nuclear pore complex; single‐cell phenotypic profiling.


Figures

Figure 1. CRISPR‐Cas9‐mediated gene perturbation by transient transfection of targeting plasmids
  1. Schematic overview of CRISPR‐Cas9‐mediated gene perturbation by transient transfection of a targeting plasmid. tdTomato expression (magenta) marks transfected cells. Single‐cell measurements are obtained by quantitative immunofluorescence (green) combined with computer vision and automated cell segmentation, see text for details.

  2. tdTomato (magenta) and TFRC (green) expression in HeLa cells transfected with a control plasmid, or a TFRC targeting plasmid. Scale bar, 50 μm.

  3. Quantification of normalized TFRC staining per cell, 1–4 days after transfection of a TFRC targeting plasmid. Violin plots of normalized TFRC staining intensity in all analysed cells (grey) or tdTomato expressing (T(+), magenta) cells.

  4. Quantification of the efficacy of genetic perturbation by TFRC, LAMP1 and YAP1 targeting plasmids; bars indicate the percentage of genetically perturbed T(+) cells. The mean ± standard deviation of three independent experiments is displayed.

  5. Evaluation of genetic perturbations in single cells using bDNA FISH. Schematic representation of the expected phenotype in wild‐type and functionally genetically perturbed cells.

  6. bDNA FISH staining of TFRC mRNA in HeLa cells transfected with a control plasmid, or a TFRC targeting plasmid. Cell outlines are indicated and colour‐coded white for T(−) cells, magenta for T(+) cells. Scale bar, 50 μm.

  7. Quantification of TFRC mRNA spots in cells transfected with a control plasmid, or a TFRC targeting plasmid. Violin plots of TFRC mRNA spot counts per T(+) cell.

  8. Heatmap representation of the efficacy of targeting plasmids designed to perturb 26 selected genes as assayed by smFISH.

Figure EV1. Functional genetic perturbation of human cells by transient transfection of targeting plasmids
  1. Immunofluorescence staining of LAMP1 in HeLa cells transfected with a control plasmid, or a LAMP1 targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean LAMP1 staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.

  2. Immunofluorescence staining of YAP1 in HeLa cells transfected with a control plasmid, or a YAP1 targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean YAP1 staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.

  3. Immunofluorescence staining of LAMP1 in U2OS cells transfected with a control plasmid, or a LAMP1 targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean LAMP1 staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.

  4. Rational selection of highly functional gRNA sequences; see the main text and Materials and Methods for details.

Figure EV2. A large‐scale arrayed CRISPR‐Cas9 screening library
  1. Schematic representation of the workflow for the construction of an arrayed CRISPR‐Cas9 screening library. A pool of oligos is synthesized and cloned into the vector backbone in a single reaction. Single colonies are picked into multi‐well plates. The gRNA sequence of every colony is PCR amplified with primers that introduce barcodes to identify the row, column and plate of the well where the colony is located. The sequence of the PCR products is analysed in a deep sequencing reaction. Unique, mutation‐free colonies are selected, re‐arrayed and miniprepped to generate an arrayed CRISPR‐Cas9 screening library.

  2. Representation of the number of picked colonies, the number of mutation‐free identified gRNAs, the number of unique mutation‐free gRNAs, the number of re‐arrayed colonies and the number of constructs in the arrayed CRISPR‐Cas9 screening library.

  3. Pie chart representing the number of genes targeted by 1, 2 or 3 targeting plasmids.

  4. Network representation of the arrayed CRISPR‐Cas9 screening library. Nodes represent selected Gene Ontology annotations of targeted genes, node size represents the number of genes with the functional annotation, edges indicate genes sharing functional annotations. Edge thickness scales with number of genes that share functional annotations. Nodes are colour coded for phosphorylation‐related functional annotations (red) or ubiquitylation‐related functional annotations (blue).
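
The colony identification step in panel 1 (barcoded PCR, deep sequencing, then selection of unique, mutation‐free colonies) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' pipeline: the read layout `(plate, row, column, sequence)`, the `min_fraction` clonality cut-off and the function name `select_colonies` are all hypothetical.

```python
from collections import Counter, defaultdict


def select_colonies(reads, designed, min_fraction=0.9):
    """Pick one mutation-free colony per designed gRNA.

    reads:    iterable of (plate, row, column, grna_sequence) tuples, i.e.
              reads whose well barcodes were already decoded (assumed layout).
    designed: set of gRNA sequences present in the synthesized oligo pool.
    min_fraction: hypothetical cut-off; the dominant sequence in a well must
              account for at least this share of the well's reads.
    Returns {grna_sequence: (plate, row, column)}.
    """
    # Count the sequences observed in every well.
    per_well = defaultdict(Counter)
    for plate, row, col, seq in reads:
        per_well[(plate, row, col)][seq] += 1

    selected = {}
    for well in sorted(per_well):
        counts = per_well[well]
        seq, n = counts.most_common(1)[0]
        if n / sum(counts.values()) < min_fraction:
            continue  # mixed colony, skip
        if seq not in designed:
            continue  # sequence carries a mutation, skip
        if seq in selected:
            continue  # this gRNA was already found in an earlier well
        selected[seq] = well
    return selected
```

The returned mapping would then drive the re‐arraying step, with one source well per gRNA.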

Figure EV3. CRISPR‐Cas9 gene perturbation profiling in HeLa cells
  1. Nucleus and cell segmentation based on image processing and computer vision of cells stained for DNA and total protein. Scale bar, 50 μm.

  2. Nucleus, cell, cytoplasm and nuclear periphery segmentation and mAb414 staining for the large‐scale CRISPR‐Cas9 gene perturbation profiling experiment of the mAb414 staining pattern. Scale bar, 50 μm.

  3. Scatterplot of misclassification errors of classifiers trained on cells transfected with plasmids targeting HSPA5, NPC components and non‐targeting controls from two independent experiments.

Figure 2. CRISPR‐Cas9 gene perturbation profiling and identification of phenotypically perturbed cells
  A. Schematic representation of the profiling of CRISPR‐Cas9 gene perturbation phenotypes. Transient transfection of a targeting plasmid results in a mixed population of wild‐type and genetically perturbed cells. Technical well‐to‐well variability can be accounted for by standardizing single‐cell features to the wild‐type cell population in every well. Logistic regression classifiers are fitted to the cell population to attempt to distinguish between T(+) and T(−) cells based on a set of single‐cell features.

  B, C. The predicted value (PV) is calculated for every cell in a well that was transiently transfected with a control targeting plasmid, or a NUP160 targeting plasmid. A positive PV indicates classification into the phenotypically perturbed class. The dotted line indicates the threshold for further single‐cell characterization [PV > 0.62 (mean + 3 × standard deviation of non‐targeting control cells)].

  D. Immunofluorescence image of mAb414 staining in HeLa cells transfected with a NUP160 targeting plasmid. Cell outlines are coloured orange for T(+) cells that show a gene perturbation phenotype (PV > 0.62), red for T(+) cells with a PV < 0.62, blue for T(−) cells. Missegmented cells are outlined grey. Scale bar, 50 μm.

  E, F. tSNE projection of cells transfected with a NUP160 targeting plasmid. Single cells are colour coded according to tdTomato expression (E) and PV (F).
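
The per‐well standardization and the PV threshold described in this figure can be sketched as follows. This is a minimal illustration rather than the published implementation: the function names are hypothetical, the PVs are assumed to come from an already fitted classifier, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def standardize_to_wild_type(values, transfected):
    """z-score one feature against the T(-) cells of the same well.

    values:      feature value per cell in one well.
    transfected: True for T(+) cells, False for T(-) cells.
    """
    wild_type = [v for v, t in zip(values, transfected) if not t]
    mu, sd = mean(wild_type), stdev(wild_type)  # sample SD (assumption)
    return [(v - mu) / sd for v in values]


def pv_threshold(control_pvs, k=3.0):
    """Mean + k * SD of the PVs of non-targeting control cells."""
    return mean(control_pvs) + k * stdev(control_pvs)


def perturbed_cells(pvs, transfected, threshold):
    """Indices of T(+) cells whose PV exceeds the threshold."""
    return [
        i
        for i, (pv, t) in enumerate(zip(pvs, transfected))
        if t and pv > threshold
    ]
```

Cells returned by `perturbed_cells` correspond to the orange outlines in panel D; T(+) cells below the threshold correspond to the red outlines.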

Figure 3. Large‐scale image‐based CRISPR‐Cas9 gene perturbation profiling
  1. Image‐based profiling of the arrayed CRISPR‐Cas9 library for perturbations affecting cellular morphology and total protein staining features. The classification score is a linear transformation of the misclassification error of logistic regression models trained to classify T(+) and T(−) cells. Perturbations targeting proteasome subunits or structural components of the NPC are colour‐coded purple and green. Non‐targeting control perturbations are colour‐coded brown. The dotted line indicates the threshold used to select perturbations that have a higher classification score than non‐targeting controls (third quartile + 1.5 × interquartile range of the classification scores of non‐targeting controls). The size of the perturbation nodes is scaled according to the phenotypic score, which reflects the KS statistic calculated between the PV distributions of non‐targeting control plasmid transfected cells and the transfected cells of the respective perturbation (see Materials and Methods).

  2. Image‐based profiling of mAb414 staining pattern. Colour coding and threshold calculation as in panel 1.

  3. Hierarchical clustering of the standardized mean feature profiles of control cells or phenotypically perturbed cells transfected with plasmids targeting HSPA5 or selected structural components of the NPC.

  4. Immunofluorescence images and schematic representation of the mAb414 staining pattern in control cells or phenotypically perturbed cells from the NUP62, HSPA5, NUP133, NUP107, NUP160 or NUP98 targeted populations. Scale bar, 10 μm.
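
The two statistics named in this figure, the selection threshold (third quartile + 1.5 × interquartile range of the control scores) and the KS statistic between two PV distributions, can be sketched in plain Python. This is an illustration under assumptions: the quartile convention (`method="inclusive"`) and the function names are choices made here, not details given in the caption.

```python
from bisect import bisect_right
from statistics import quantiles


def upper_fence(control_scores, k=1.5):
    """Third quartile + k * IQR of the non-targeting control scores."""
    # The quartile interpolation method is an assumption.
    q1, _, q3 = quantiles(control_scores, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Largest absolute difference between the two empirical cumulative
    distribution functions, evaluated at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A perturbation would be selected when its classification score exceeds `upper_fence(control_scores)`, while `ks_statistic(control_pvs, perturbation_pvs)` gives a value between 0 (identical distributions) and 1 (no overlap).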

Figure EV4. Large‐scale image‐based CRISPR‐Cas9 gene perturbation profiling
  1. Network representation of selected GO terms associated with perturbations identified in the profiling of cell morphology and total protein staining features. Edges between nodes are formed if GO terms share genes. Node size represents enrichment of GO terms relative to the screening library, and the P‐value is calculated using a hypergeometric test.

  2. Network representation of GO terms associated with perturbations identified in the profiling of the mAb414 staining pattern. Node size represents enrichment of GO terms relative to the screening library, and P‐values are calculated using a hypergeometric test.

  3. Hierarchical clustering of the mean feature values of phenotypically perturbed cells from populations of cells transfected with plasmids targeting proteasome subunits and mean feature profiles of cells transfected with non‐targeting control plasmids. The mean feature profiles were calculated based on all features used in the cell morphology profiling.
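
The hypergeometric test used for GO term enrichment relative to the screening library can be written out directly. The sketch below assumes a one‐sided test for over‐representation; the argument names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(library_size, term_in_library, n_hits, term_in_hits):
    """P(X >= term_in_hits) under sampling without replacement.

    library_size:    number of genes in the screening library.
    term_in_library: library genes annotated with the GO term.
    n_hits:          number of genes selected as hits.
    term_in_hits:    hits annotated with the GO term.
    """
    total = comb(library_size, n_hits)
    upper = min(term_in_library, n_hits)
    tail = sum(
        comb(term_in_library, k)
        * comb(library_size - term_in_library, n_hits - k)
        for k in range(term_in_hits, upper + 1)
    )
    return tail / total
```

For example, with a 10‐gene library containing 3 annotated genes, drawing 2 hits that are both annotated has probability 3/45.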

Figure EV5. Phenotypic profiling by between‐well comparison of feature profiles
  A, C. The mean mAb414 feature profiles (A) or cell morphology features (C) were calculated for T(+) and T(−) cells per well. For each profile, the Mahalanobis distance from the distribution of all profiles was calculated. Nodes represent feature profiles, colour‐coded magenta and grey for profiles obtained from T(+) and T(−) cells, respectively. The dotted line indicates the threshold used to select perturbations that have a large distance to non‐targeting controls (third quartile + 3 × interquartile range of the distance of non‐targeting controls). Nodes are scaled according to the classification score, which is based on the within‐well comparison of T(+) and T(−) cells.

  B, D. The Mahalanobis distance of T(+) profiles from the total distribution of mean feature profiles was plotted against the classification score (as obtained from within‐well comparison of T(+) and T(−) cells) for the profiling of the mAb414 features (B) and cell morphology features (D).
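
The Mahalanobis distance of each mean feature profile from the distribution of all profiles can be computed as in the sketch below. It is written in plain Python for illustration and assumes the sample covariance matrix is invertible (more profiles than features, no perfectly collinear features); the helper `solve` and both function names are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_distances(profiles):
    """Distance of every profile (row) from the distribution of all rows."""
    n, d = len(profiles), len(profiles[0])
    mu = [sum(p[j] for p in profiles) / n for j in range(d)]
    centred = [[p[j] - mu[j] for j in range(d)] for p in profiles]
    # Sample covariance matrix of the features (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # d(x)^2 = (x - mu)^T cov^-1 (x - mu), via a linear solve per profile.
    return [
        sum(ci * yi for ci, yi in zip(c, solve(cov, c))) ** 0.5
        for c in centred
    ]
```

Profiles whose distance exceeds the control‐derived threshold would be the ones flagged in panels A and C.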

Figure EV6. Mean feature profiles of targeted cells are highly consistent between gRNA sequences
  1. Boxplots of the standardized single‐cell feature values of all transfected cells and phenotypically perturbed cells transfected with plasmids targeting NUP62, HSPA5, NUP107 or NUP98; bar graphs show the percentage of T(+) cells with a PV > 0.62. Boxes indicate the 1st and 3rd quartiles of the data distribution. The whiskers extend to the most extreme datapoints within 1.5 times the interquartile range (IQR) below the 1st quartile and above the 3rd quartile.

  2. Schematic representation of the NPC, adapted from Weberruss & Antonin (2016).

  3. Cells were transfected with three independent plasmids targeting each of the genes NUP62, HSPA5, NUP107 or NUP98. Mean feature profiles were obtained from all transfected cells, or the subset of T(+) cells with a high PV. The Pearson correlation coefficient between pairs of profiles obtained from populations targeted for the same gene with different plasmids was calculated. The correlations between profiles obtained from all transfected cells, or the subset of T(+) cells with a high PV are compared.

  4. Boxplots of Pearson's correlation coefficients calculated between mean feature profiles of phenotypically perturbed cells transfected with plasmids targeting the same gene, or different genes. Phenotypic profiles were obtained from cells transfected with plasmids targeting selected subunits of the NPC and HSPA5 (green) or the top‐scoring genes that were identified in the large‐scale profiling of the mAb414 staining features for which multiple targeting plasmids were present in the library (grey). Boxes indicate the 1st and 3rd quartiles of the data distribution. The whiskers extend to the most extreme datapoints within 1.5 times the interquartile range (IQR) below the 1st quartile and above the 3rd quartile.
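
The same‐gene versus different‐gene comparison of mean feature profiles in panels 3 and 4 can be illustrated with a short sketch. The dictionary layout keyed by `(gene, plasmid_id)` and the function names are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(profiles):
    """Split pairwise correlations into same-gene and different-gene pairs.

    profiles: {(gene, plasmid_id): mean feature profile} (assumed layout).
    Returns (same_gene_correlations, different_gene_correlations).
    """
    same, different = [], []
    for (k1, p1), (k2, p2) in combinations(sorted(profiles.items()), 2):
        r = pearson(p1, p2)
        (same if k1[0] == k2[0] else different).append(r)
    return same, different
```

Higher correlations in the first list than in the second would indicate that independent plasmids targeting the same gene give consistent phenotypes.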
