Mol Syst Biol. 2018 Jan 23;14(1):e8064. doi: 10.15252/msb.20178064.

Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens

Reinoud de Groot¹, Joel Lüthi¹,², Helen Lindsay¹, René Holtackers¹, Lucas Pelkmans³

Affiliations

¹ Institute of Molecular Life Sciences, University of Zürich, Zürich, Switzerland.
² Systems Biology PhD program, Life Science Zürich Graduate School ETH Zürich and University of Zürich, Zürich, Switzerland.
³ Institute of Molecular Life Sciences, University of Zürich, Zürich, Switzerland. lucas.pelkmans@imls.uzh.ch

PMID: 29363560
PMCID: PMC5787707
DOI: 10.15252/msb.20178064

Reinoud de Groot et al. Mol Syst Biol. 2018.

Abstract

High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CRISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9.

Keywords: CRISPR‐Cas9; arrayed library; functional genomics; nuclear pore complex; single‐cell phenotypic profiling.

Figures

**Figure 1. CRISPR‐Cas9‐mediated gene perturbation by transient transfection of targeting plasmids**
Schematic overview of CRISPR‐Cas9‐mediated gene perturbation by transient transfection of a targeting plasmid. tdTomato expression (magenta) marks transfected cells. Single‐cell measurements are obtained by quantitative immunofluorescence (green) combined with computer vision and automated cell segmentation, see text for details.
tdTomato (magenta) and TFRC (green) expression in HeLa cells transfected with a control plasmid, or a *TFRC* targeting plasmid. Scale bar, 50 μm.
Quantification of normalized TFRC staining per cell, 1–4 days after transfection of a *TFRC* targeting plasmid. Violin plots of normalized TFRC staining intensity in all analysed cells (grey) or tdTomato expressing (T(+), magenta) cells.
Quantification of the efficacy of genetic perturbation by *TFRC*, *LAMP1* and *YAP1* targeting plasmids; bars indicate the percentage of genetically perturbed T(+) cells. The mean ± standard deviation of three independent experiments is displayed.
Evaluation of genetic perturbations in single cells using bDNA FISH. Schematic representation of the expected phenotype in wild‐type and functionally genetically perturbed cells.
bDNA FISH staining of *TFRC* mRNA in HeLa cells transfected with a control plasmid, or a *TFRC* targeting plasmid. Cell outlines are indicated and colour‐coded white for T(−) cells, magenta for T(+) cells. Scale bar, 50 μm.
Quantification of *TFRC* mRNA spots in cells transfected with a control plasmid, or a *TFRC* targeting plasmid. Violin plots of *TFRC* mRNA spot counts per T(+) cell.
Heatmap representation of the efficacy of targeting plasmids designed to perturb 26 selected genes as assayed by smFISH.

**Figure EV1. Functional genetic perturbation of human cells by transient transfection of targeting plasmids**
Immunofluorescence staining of LAMP1 in HeLa cells transfected with a control plasmid, or a *LAMP1* targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean LAMP1 staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.
Immunofluorescence staining of YAP1 in HeLa cells transfected with a control plasmid, or a *YAP1* targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean YAP1 staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.
Immunofluorescence staining of LAMP1 in U2OS cells transfected with a control plasmid, or a *LAMP1* targeting plasmid. Scale bar, 50 μm. Violin plots of normalized mean TFRC staining intensity in tdTomato expressing (T(+)) cells 4 days post‐transfection.
Rational selection of highly functional gRNA sequences, see main text and Materials and Methods for details.

**Figure EV2. A large‐scale arrayed CRISPR‐Cas9 screening library**
Schematic representation of the workflow for the construction of an arrayed CRISPR‐Cas9 screening library. A pool of oligos is synthesized and cloned into the vector backbone in a single reaction. Single colonies are picked into multi‐well plates. The gRNA sequence of every colony is PCR amplified with primers that introduce barcodes to identify the row, column and plate of the well where the colony is located. The sequence of the PCR products is analysed in a deep sequencing reaction. Unique, mutation‐free colonies are selected, re‐arrayed and miniprepped to generate an arrayed CRISPR‐Cas9 screening library.
Representation of the number of picked colonies, the number of mutation‐free identified gRNAs, the number of unique mutation‐free gRNAs, the number of re‐arrayed colonies and the number of constructs in the arrayed CRISPR‐Cas9 screening library.
Pie chart representing the number of genes targeted by 1, 2 or 3 targeting plasmids.
Network representation of the arrayed CRISPR‐Cas9 screening library. Nodes represent selected Gene Ontology annotations of targeted genes, node size represents the number of genes with the functional annotation, edges indicate genes sharing functional annotations. Edge thickness scales with number of genes that share functional annotations. Nodes are colour coded for phosphorylation‐related functional annotations (red) or ubiquitylation‐related functional annotations (blue).

**Figure EV3. CRISPR‐Cas9 gene perturbation profiling in HeLa cells**
Nucleus and cell segmentation based on image processing and computer vision of cells stained for DNA and total protein. Scale bar, 50 μm.
Nucleus, cell, cytoplasm and nuclear periphery segmentation and mAb414 staining for the large‐scale CRISPR‐Cas9 gene perturbation profiling experiment of the mAb414 staining pattern. Scale bar, 50 μm.
Scatterplot of misclassification errors of classifiers trained on cells transfected with plasmids targeting *HSPA5*, NPC components and non‐targeting controls from two independent experiments.

**Figure 2. CRISPR‐Cas9 gene perturbation profiling and identification of phenotypically perturbed cells**
(A) Schematic representation of the profiling of CRISPR‐Cas9 gene perturbation phenotypes. Transient transfection of a targeting plasmid results in a mixed population of wild‐type and genetically perturbed cells. Technical well‐to‐well variability can be accounted for by standardizing single‐cell features to the wild‐type cell population in every well. Logistic regression classifiers are fitted to the cell population to attempt to distinguish between T(+) and T(−) cells based on a set of single‐cell features.
(B, C) The predicted value (PV) is calculated for every cell in a well that was transiently transfected with a control targeting plasmid, or a *NUP160* targeting plasmid. A positive PV indicates classification into the phenotypically perturbed class. The dotted line indicates the threshold for further single‐cell characterization [PV > 0.62 (mean + 3 × standard deviation of non‐targeting control cells)].
(D) Immunofluorescence image of mAb414 staining in HeLa cells transfected with a *NUP160* targeting plasmid. Cell outlines are coloured orange for T(+) cells that show a gene perturbation phenotype (PV > 0.62), red for T(+) cells with a PV < 0.62, blue for T(−) cells. Missegmented cells are outlined grey. Scale bar, 50 μm.
(E, F) tSNE projection of cells transfected with a *NUP160* targeting plasmid. Single cells are colour coded according to tdTomato expression (E) and PV (F).
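The per‐well standardization and the PV cut‐off described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the toy numbers are invented, and the use of the sample standard deviation is an assumption because the legend does not say which estimator was used.

```python
from statistics import mean, stdev


def standardize_to_wild_type(values, is_transfected):
    """Z-score one feature against the T(-) cells of a single well.

    `values` holds one feature value per cell; `is_transfected` marks T(+) cells.
    """
    wild_type = [v for v, t in zip(values, is_transfected) if not t]
    mu, sigma = mean(wild_type), stdev(wild_type)  # sample SD (assumption)
    return [(v - mu) / sigma for v in values]


def pv_threshold(control_pvs, n_sd=3.0):
    """Mean + n_sd x standard deviation of the PVs of non-targeting control cells."""
    return mean(control_pvs) + n_sd * stdev(control_pvs)


# Hypothetical toy values, not data from the paper.
control_pvs = [-0.2, 0.0, 0.1, -0.1, 0.2]
threshold = pv_threshold(control_pvs)

targeted_pvs = [0.05, 0.9, 1.4, -0.3]
perturbed = [pv > threshold for pv in targeted_pvs]
```

Standardizing against the T(−) cells of the same well gives every well its own internal reference, which is the property the legend relies on to absorb well‐to‐well variability.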

**Figure 3. Large‐scale image‐based CRISPR‐Cas9 gene perturbation profiling**
Image‐based profiling of the arrayed CRISPR‐Cas9 library for perturbations affecting cellular morphology and total protein staining features. The classification score is a linear transformation of the misclassification error of logistic regression models trained to classify T(+) and T(−) cells. Perturbations targeting proteasome subunits or structural components of the NPC are colour‐coded purple and green, respectively. Non‐targeting control perturbations are colour‐coded brown. The dotted line indicates the threshold used to select perturbations that have a higher classification score than non‐targeting controls (third quartile + 1.5 × interquartile range of the classification scores of non‐targeting controls). The size of the perturbation nodes is scaled according to the phenotypic score, which reflects the KS statistic calculated between the PV distributions of non‐targeting control plasmid transfected cells and the transfected cells of the respective perturbation (see Materials and Methods).
Image‐based profiling of mAb414 staining pattern. Colour coding and threshold calculation as in (A).
Hierarchical clustering of the standardized mean feature profiles of control cells or phenotypically perturbed cells transfected with plasmids targeting *HSPA5* or selected structural components of the NPC.
Immunofluorescence images and schematic representation of the mAb414 staining pattern in control cells or phenotypically perturbed cells from the *NUP62*, *HSPA5*, *NUP133*, *NUP107*, *NUP160* or *NUP98* targeted populations. Scale bar, 10 μm.
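The two statistics named in this legend, the selection threshold (third quartile + 1.5 × interquartile range of control scores) and the two‐sample KS statistic between PV distributions, can be illustrated with the short sketch below. The function names are hypothetical, and the quartile interpolation method is an assumption, since the legend does not specify one.

```python
from bisect import bisect_right
from statistics import quantiles


def upper_fence(control_scores, k=1.5):
    """Third quartile + k x interquartile range of the control scores."""
    # "inclusive" interpolation is an assumption; other methods shift the fence slightly.
    q1, _, q3 = quantiles(control_scores, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The gap between two step functions can only peak at an observed value.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
```

A perturbation would be selected when its classification score exceeds `upper_fence(control_scores)`, while `ks_statistic(control_pvs, perturbation_pvs)` gives a value between 0 (identical PV distributions) and 1 (fully separated distributions).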

**Figure EV4. Large‐scale image‐based CRISPR‐Cas9 gene perturbation profiling**
Network representation of selected GO terms associated with perturbations identified in the profiling of cell morphology and total protein staining features. Edges between nodes are formed if GO terms share genes. Node size represents enrichment of GO terms relative to the screening library, and the P‐value is calculated using a hypergeometric test.
Network representation of GO terms associated with perturbations identified in the profiling of the mAb414 staining pattern. Node size represents enrichment of GO terms relative to the screening library, and P‐values are calculated using a hypergeometric test.
Hierarchical clustering of the mean feature values of phenotypically perturbed cells from populations of cells transfected with plasmids targeting proteasome subunits and mean feature profiles of cells transfected with non‐targeting control plasmids. The mean feature profiles were calculated based on all features used in the cell morphology profiling.
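The hypergeometric test mentioned for the GO term networks can be sketched as below. This is an illustrative one‐sided enrichment calculation with hypothetical argument names; whether the authors used a one‐sided test, and which gene counts entered it, is not stated in the legend.

```python
from math import comb


def hypergeom_enrichment_p(library_size, annotated_in_library, n_hits, annotated_hits):
    """P(X >= annotated_hits) when n_hits genes are drawn without replacement.

    library_size:          number of genes in the screening library
    annotated_in_library:  library genes carrying the GO term
    n_hits:                genes selected by the screen
    annotated_hits:        selected genes carrying the GO term
    """
    upper = min(n_hits, annotated_in_library)
    total = comb(library_size, n_hits)
    tail = sum(
        comb(annotated_in_library, i)
        * comb(library_size - annotated_in_library, n_hits - i)
        for i in range(annotated_hits, upper + 1)
    )
    return tail / total
```

Using the screening library as the background, rather than the whole genome, matches the legend's statement that enrichment is measured relative to the screening library.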

**Figure EV5. Phenotypic profiling by between‐well comparison of feature profiles**
(A, C) The mean mAb414 feature profiles (A) or cell morphology features (C) were calculated for T(+) and T(−) cells per well. For each profile, the Mahalanobis distance from the distribution of all profiles was calculated. Nodes represent feature profiles, colour‐coded magenta and grey for profiles obtained from T(+) and T(−) cells, respectively. The dotted line indicates the threshold used to select perturbations that have a large distance to non‐targeting controls (third quartile + 3 × interquartile range of the distance of non‐targeting controls). Nodes are scaled according to the classification score, which is based on the within‐well comparison of T(+) and T(−) cells.
(B, D) The Mahalanobis distance of T(+) profiles from the total distribution of mean feature profiles was plotted against the classification score (as obtained from within‐well comparison of T(+) and T(−) cells) for the profiling of the mAb414 features (B) and cell morphology features (D).

**Figure EV6. Mean feature profiles of targeted cells are highly consistent between gRNA sequences**
Boxplots of the standardized single‐cell feature values of all transfected cells and phenotypically perturbed cells transfected with plasmids targeting *NUP62*, *HSPA5*, *NUP107* or *NUP98*, with a bar graph representation of the percentage of T(+) cells with a PV > 0.62. Boxes indicate the 1st and 3rd quartile of the data distribution. The whiskers extend to the most extreme datapoints lying between the 1st quartile minus 1.5 times the interquartile range (IQR) and the 3rd quartile plus 1.5 times the IQR.
Schematic representation of the NPC, adapted from Weberruss and Antonin (2016).
Cells were transfected with three independent plasmids targeting each of the genes *NUP62*, *HSPA5*, *NUP107* or *NUP98*. Mean feature profiles were obtained from all transfected cells, or the subset of T(+) cells with a high PV. The Pearson correlation coefficient between pairs of profiles obtained from populations targeted for the same gene with different plasmids was calculated. The correlations between profiles obtained from all transfected cells, or the subset of T(+) cells with a high PV are compared.
Boxplots of Pearson's correlation coefficients calculated between mean feature profiles of phenotypically perturbed cells transfected with plasmids targeting the same gene, or different genes. Phenotypic profiles were obtained from cells transfected with plasmids targeting selected subunits of the NPC and HSPA5 (green) or the top‐scoring genes that were identified in the large‐scale profiling of the mAb414 staining features for which multiple targeting plasmids were present in the library (grey). Boxes indicate the 1st and 3rd quartile of the data distribution. The whiskers extend to the most extreme datapoints lying between the 1st quartile minus 1.5 times the interquartile range (IQR) and the 3rd quartile plus 1.5 times the IQR.
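The comparison of mean feature profiles by Pearson correlation, as described in this legend, can be sketched as follows. The helper names and the cell‐by‐feature list layout are assumptions made for illustration only.

```python
from math import sqrt


def mean_profile(cells):
    """Column-wise mean of a list of per-cell feature vectors."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

In this sketch, one profile would be computed per targeting plasmid, either from all T(+) cells or only from T(+) cells with a high PV, and `pearson` would then be applied to pairs of profiles from plasmids targeting the same gene or different genes.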

Cited by

Subcellular proteomics.
Christopher JA, Stadler C, Martin CE, Morgenstern M, Pan Y, Betsinger CN, Rattray DG, Mahdessian D, Gingras AC, Warscheid B, Lehtiö J, Cristea IM, Foster LJ, Emili A, Lilley KS. Nat Rev Methods Primers. 2021;1:32. doi: 10.1038/s43586-021-00029-y. Epub 2021 Apr 29. PMID: 34549195. Free PMC article.
Transitive prediction of small-molecule function through alignment of high-content screening resources.
Bao F, Li L, Hammerlindl H, Shen SQ, Hammerlindl S, Altschuler SJ, Wu LF. Nat Biotechnol. 2025 Jul 11. doi: 10.1038/s41587-025-02729-2. Online ahead of print. PMID: 40646169.
Blank spots on the map: some current questions on nuclear organization and genome architecture.
Adriaens C, Serebryannyy LA, Feric M, Schibler A, Meaburn KJ, Kubben N, Trzaskoma P, Shachar S, Vidak S, Finn EH, Sood V, Pegoraro G, Misteli T. Histochem Cell Biol. 2018 Dec;150(6):579-592. doi: 10.1007/s00418-018-1726-1. Epub 2018 Sep 20. PMID: 30238154. Free PMC article. Review.
Rapid identification of human mast cell degranulation regulators using functional genomics coupled to high-resolution confocal microscopy.
Folkerts J, Gaudenzio N, Maurer M, Hendriks RW, Stadhouders R, Tam SY, Galli SJ. Nat Protoc. 2020 Mar;15(3):1285-1310. doi: 10.1038/s41596-019-0288-6. Epub 2020 Feb 14. PMID: 32060492. Free PMC article.
A statistical simulation model to guide the choices of analytical methods in arrayed CRISPR screen experiments.
Kim CS, Cairns J, Quarantotti V, Kaczkowski B, Wang Y, Konings P, Zhang X. PLoS One. 2024 Aug 20;19(8):e0307445. doi: 10.1371/journal.pone.0307445. eCollection 2024. PMID: 39163294. Free PMC article.
