[Preprint]. 2025 Aug 7:arXiv:2508.05800v1.

Progress and new challenges in image-based profiling

Erik Serrano et al. arXiv.

Abstract

For over two decades, image-based profiling has revolutionized cellular phenotype analysis. Image-based profiling processes rich, high-throughput microscopy data into unbiased measurements that reveal phenotypic patterns powerful for drug discovery, functional genomics, and cell state classification. Here, we review the evolving computational landscape of image-based profiling, detailing current procedures, discussing limitations, and highlighting future development directions. Deep learning has fundamentally reshaped image-based profiling, improving feature extraction, scalability, and multimodal data integration. Methodological advancements such as single-cell analysis and batch effect correction, drawing inspiration from single-cell transcriptomics, have enhanced analytical precision. The growth of open-source software ecosystems and the development of community-driven standards have further democratized access to image-based profiling, fostering reproducibility and collaboration across research groups. Despite these advancements, the field still faces significant challenges requiring innovative solutions. By focusing on the technical evolution of image-based profiling rather than the wide-ranging biological applications, our aim with this review is to provide researchers with a roadmap for navigating the progress and new challenges in this rapidly advancing domain.

Conflict of interest statement

N.O.C. is co-founder, shareholder and management consultant for PhenoTherapeutics Ltd. S.S. and A.E.C. serve as scientific advisors for companies that use image-based profiling and Cell Painting (A.E.C.: Recursion, SyzOnc, Quiver Bioscience; S.S.: Waypoint Bio, Dewpoint Therapeutics, Deepcell) and receive honoraria for occasional scientific visits to pharmaceutical and biotechnology companies. All other authors declare that they have no conflict of interest.

Figures

Figure 1. The comprehensive image-based profiling workflow.
A complete image-based profiling workflow spans from (A) experimental design, sample preparation, and image acquisition through the generation of digitized cellular representations to (B) the downstream processing of those representations.
Figure 2. Deep learning in image-based profiling.
There are two primary model architectures for deep learning-based feature extraction: (A) Convolutional Neural Networks (CNNs) process images through stacked convolutional layers to extract local features in a hierarchical manner. (B) Vision Transformers (ViTs) process images by dividing them into patches, which are linearly embedded and combined with positional encodings to retain spatial information. Patch embeddings (and an aggregating class token) are then processed by a series of transformer blocks that leverage self-attention mechanisms for global context understanding. The features of the class token are then used for downstream analysis. (C) A generalized pipeline for deep learning in image-based profiling consists of three core steps and two optional steps: (1) model selection, where users choose an architecture and training strategy to construct a model or adopt a pre-trained one; (2) optional segmentation, where cell centroids or masks are produced; (3) feature extraction, producing either image-level embeddings from full fields of view or single-cell representations guided by previously calculated segmentations; (4) optional aggregation, commonly applied to single-cell data by combining features by treatment, well, or image; and (5) normalization and batch correction, a crucial step to account for technical variation and ensure comparability across experiments.
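Steps (4) and (5) of panel C can be illustrated with a minimal sketch. It assumes single-cell features are already available as a NumPy array, uses the median as the aggregation statistic, and normalizes against control wells with a robust z-score (median and median absolute deviation). The function names `aggregate_by_well` and `robust_normalize` are hypothetical and chosen for illustration; the caption does not prescribe a specific aggregation or normalization method, and batch correction proper is not shown.

```python
import numpy as np


def aggregate_by_well(features, well_ids):
    """Median-aggregate single-cell features into one profile per well.

    features: array of shape (n_cells, n_features)
    well_ids: sequence of length n_cells giving each cell's well label
    Returns (sorted well labels, array of shape (n_wells, n_features)).
    """
    features = np.asarray(features, dtype=float)
    well_ids = np.asarray(well_ids)
    wells = np.unique(well_ids)  # sorted unique labels
    profiles = np.vstack(
        [np.median(features[well_ids == w], axis=0) for w in wells]
    )
    return wells.tolist(), profiles


def robust_normalize(profiles, control_mask, eps=1e-9):
    """Robust z-score each feature relative to control profiles.

    Uses (x - median) / (1.4826 * MAD), with both statistics computed
    on the rows selected by control_mask. eps guards against zero MAD.
    """
    profiles = np.asarray(profiles, dtype=float)
    controls = profiles[np.asarray(control_mask, dtype=bool)]
    med = np.median(controls, axis=0)
    mad = np.median(np.abs(controls - med), axis=0)
    return (profiles - med) / (1.4826 * mad + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cells = rng.normal(size=(300, 4))           # synthetic single-cell features
    labels = np.repeat(["A01", "A02", "B01"], 100)
    wells, profiles = aggregate_by_well(cells, labels)
    # Assumption for the demo: wells A01 and A02 are negative controls.
    normalized = robust_normalize(profiles, [True, True, False])
    print(wells, normalized.shape)
```

Median and MAD are used here because they are less sensitive to outlier cells than mean and standard deviation; other choices are equally valid for this step.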
Figure 3. Leveraging single-cell resolution in image-based profiling to uncover cellular heterogeneity and improve hit detection.
(A) Comparison between traditional population-averaged profiling and single-cell approaches. While aggregate profiles may obscure heterogeneous cellular responses, single-cell analysis reveals distinct subpopulations, as illustrated by diverging distributions in cumulative density plots. (B) Overview of a representative single-cell image-based profiling pipeline, encompassing key stages from image acquisition and feature extraction to cell-level data normalization and quality control. (C) Application of statistical tests, such as the Kolmogorov–Smirnov (KS) statistic, and distance-based metrics, such as Earth Mover’s Distance (EMD), to cumulative density functions for detecting perturbations that induce significant shifts in single-cell feature distributions, enabling more sensitive and nuanced hit detection.
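The two distribution comparisons named in panel C can be sketched for a single feature as follows. This is an illustrative NumPy implementation, not the authors' pipeline: `ks_statistic` and `emd_1d` are hypothetical names, both operate on one-dimensional samples (one feature measured across the cells of a control and a treated population), and no significance testing is included. The KS statistic is the largest vertical gap between the two empirical cumulative distributions, while the one-dimensional EMD is the area between them.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # the maximum is attained at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def emd_1d(a, b):
    """1-D Earth Mover's Distance: integral of |F_a(x) - F_b(x)| dx."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.sort(np.concatenate([a, b]))
    # Both empirical CDFs are constant between consecutive grid points,
    # so the integral is a sum of rectangle areas.
    cdf_a = np.searchsorted(a, grid[:-1], side="right") / a.size
    cdf_b = np.searchsorted(b, grid[:-1], side="right") / b.size
    return float(np.sum(np.abs(cdf_a - cdf_b) * np.diff(grid)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    control = rng.normal(0.0, 1.0, size=500)
    # Synthetic treatment in which only a subpopulation of cells responds.
    treated = np.concatenate(
        [rng.normal(0.0, 1.0, size=350), rng.normal(3.0, 1.0, size=150)]
    )
    print("KS :", ks_statistic(control, treated))
    print("EMD:", emd_1d(control, treated))
```

Because both measures compare whole distributions, they can register a shift confined to a subpopulation that a comparison of well-level averages might dilute.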
