Nat Methods. 2017 Aug 31;14(9):849-863.
doi: 10.1038/nmeth.4397.

Data-analysis strategies for image-based cell profiling

Juan C Caicedo et al. Nat Methods. 2017.

Abstract

Image-based cell profiling is a high-throughput strategy for the quantification of phenotypic differences among a variety of cell populations. It paves the way to studying biological systems on a large scale by using chemical and genetic perturbations. The general workflow for this technology involves image acquisition with high-throughput microscopy systems and subsequent image processing and analysis. Here, we introduce the steps required to create high-quality image-based (i.e., morphological) profiles from a collection of microscopy images. We recommend techniques that have proven useful in each stage of the data analysis process, on the basis of the experience of 20 laboratories worldwide that are refining their image-based cell-profiling methodologies in pursuit of biological discovery. The recommended techniques cover alternatives that may suit various biological goals, experimental designs, and laboratories' preferences.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Representative workflow for image-based cell profiling.
Eight main steps transform images into quantitative information to support experimental conclusions.
Figure 2. Methods used for image analysis.
(a) Illumination-correction function estimated with a retrospective multi-image method. Pixels in the center of the field of view are systematically brighter than pixels at the edges. (b) Image segmentation aims to classify pixels as either foreground or background, i.e., as being part of an object or not. Here, regions have been segmented with the model-based approach.
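As a rough illustration of the retrospective multi-image idea in panel (a), the sketch below estimates an illumination function as the per-pixel mean over many images and divides each image by it. This is a simplified assumption, not the authors' procedure: the function names, the synthetic vignette, and the absence of any smoothing step are all hypothetical choices made for the example.

```python
import numpy as np

def estimate_illumination(images):
    """Per-pixel mean over a stack of images, scaled to a mean of 1.

    Simplified stand-in for a retrospective multi-image estimate;
    practical pipelines typically also smooth this function.
    """
    stack = np.asarray(images, dtype=float)
    mean_image = stack.mean(axis=0)
    return mean_image / mean_image.mean()

def correct(image, illum, eps=1e-6):
    """Divide an image by the illumination function (guarding against zeros)."""
    return np.asarray(image, dtype=float) / np.maximum(illum, eps)

# Synthetic data (hypothetical): a field that is brighter in the center
# than at the edges, multiplied by random per-pixel content.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[-1:1:32j, -1:1:32j]
vignette = 1.0 - 0.4 * (xx**2 + yy**2)
images = [vignette * rng.uniform(0.8, 1.2, size=vignette.shape)
          for _ in range(200)]

illum = estimate_illumination(images)
corrected = np.stack([correct(im, illum) for im in images])
raw_mean = np.stack(images).mean(axis=0)
corrected_mean = corrected.mean(axis=0)
```

In this toy setting the mean of the raw images is much brighter in the center than at the corners, whereas the mean of the corrected images is flat.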
Figure 3. Diversity of feature distributions in morphological profiling.
(a–h) Morphological features display various types of distributions, including normal (a), skewed (b,c), uniform (d), multimodal (e–g), and even discrete distributions (h). The ranges in which features are represented also vary considerably. These histograms were obtained with feature values from a sample of 10,000 cells in the BBBC021 data set. The names of features correspond to conventions used in the CellProfiler software. The x axes show feature values (in different units), and the y axes show frequencies (cell counts).
Figure 4. Example diagnostic plots for detecting batch effects and plate-layout effects.
(a) Process of detecting batch effects. The largest matrix on the right shows how plates 1 and 2 are more correlated to each other than to plates 3 and 4, and vice versa. This pattern suggests that plates 1 and 2, as well as 3 and 4, were prepared in batches that have noticeable differences in their experimental conditions. (b) Two plate layouts illustrating the cell count in each well. The visualization allows for identification of plate-layout effects, such as unfavorable edge conditions. Plate 1 shows that cells can grow normally in any well, whereas plate 2 shows markedly lower cell counts at the edges, thus indicating the presence of experimental artifacts.
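The plate-to-plate correlation diagnostic described in panel (a) can be sketched as follows. This is a minimal example under stated assumptions: each plate is summarized by a single hypothetical profile vector, the data are synthetic, and the function name `plate_correlation` is invented for illustration.

```python
import numpy as np

def plate_correlation(plate_profiles):
    """Pearson correlation between per-plate summary profiles.

    plate_profiles: dict mapping plate name -> 1-D feature vector
    (hypothetical input format, e.g. a mean control-well profile).
    Returns the sorted plate names and the correlation matrix.
    """
    names = sorted(plate_profiles)
    matrix = np.vstack([plate_profiles[n] for n in names])
    return names, np.corrcoef(matrix)

# Synthetic example: plates 1-2 share one batch signature, plates 3-4 another.
rng = np.random.default_rng(0)
batch_a = rng.normal(size=50)
batch_b = rng.normal(size=50)
profiles = {
    "plate1": batch_a + 0.1 * rng.normal(size=50),
    "plate2": batch_a + 0.1 * rng.normal(size=50),
    "plate3": batch_b + 0.1 * rng.normal(size=50),
    "plate4": batch_b + 0.1 * rng.normal(size=50),
}
names, corr = plate_correlation(profiles)
```

A block structure in `corr`, with high values within plates 1 and 2 and within plates 3 and 4 but low values across the two groups, is the pattern the caption associates with a batch effect.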
Figure 5. Single-cell data aggregation.
The feature matrices of two treatments show the measurements of their cell populations in the experiment. These measurements have been collapsed into median profiles that show very distinct signatures corresponding to two selected compounds: etoposide and floxuridine.
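Collapsing a single-cell feature matrix into a median profile, as described in this caption, amounts to taking the median of each feature column. The sketch below assumes a cells-by-features NumPy array; the function name and the toy values are hypothetical.

```python
import numpy as np

def median_profile(cells):
    """Collapse a (cells x features) matrix into one per-feature median vector."""
    cells = np.asarray(cells, dtype=float)
    return np.median(cells, axis=0)

# Toy treatment with three cells and two features; the first feature
# contains one outlying cell.
cells_a = np.array([[1.0, 10.0],
                    [2.0, 12.0],
                    [100.0, 11.0]])
profile_a = median_profile(cells_a)
```

The median is used here because it is less sensitive to outlying cells than the mean: the outlier in the first feature barely affects the resulting profile.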
Figure 6. Visualizations for downstream analysis.
The source data are from 148,649 cells from the BBBC021 data set. (a) A heat map of the distance matrix shows the correlations between all pairs of samples, by using sample-level data (described in 'Measuring profile similarity'). (b–d) The cellular heterogeneity landscape can be visualized from single-cell data by using PCA (b), tSNE scatter plots (c) or a SPADE tree (d). In these examples, single-cell data points are colored according to a single-cell shape feature 'cytoplasm area shape extent' (red, high; blue, low). (e) A separate visualization for each treatment can assist in interpreting phenotypic changes induced by sample treatments. A constant SPADE tree is shown, and treatment-induced shifts in the number of cells in each 'node' of the tree are shown by the color scale depicted. The first three treatments are known to have a similar functional effect (Aurora kinase inhibition), and they exhibit similar cell distributions on the SPADE tree. The remaining three treatments are known to induce protein degradation, inducing cell distributions that differ from the first three.

