Review

Mol Oncol. 2025 Jun;19(6):1565-1581.

doi: 10.1002/1878-0261.13783. Epub 2025 Feb 10.

Addressing persistent challenges in digital image analysis of cancer tissue: resources developed from a hackathon

Sandhya Prabhakaran¹, Clarence Yapp², Gregory J Baker², Johanna Beyer³, Young Hwan Chang⁴, Allison L Creason⁴, Robert Krueger⁵, Jeremy Muhlich⁶, Nathan Heath Patterson⁷, Kevin Sidak⁵, Damir Sudar⁸, Adam J Taylor⁹, Luke Ternes⁴, Jakob Troidl³, Xie Yubin¹⁰, Artem Sokolov², Darren R Tyson¹¹

Affiliations

¹ Moffitt Cancer Center, Tampa, FL, USA.
² Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA.
³ School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA.
⁴ Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
⁵ Harvard University, Cambridge, MA, USA.
⁶ Harvard Medical School, Boston, MA, USA.
⁷ Aspect Analytics, Genk, Belgium.
⁸ Quantitative Imaging Systems, Monroeville, PA, USA.
⁹ Sage Bionetworks, Seattle, WA, USA.
¹⁰ Memorial Sloan Kettering Cancer Center, New York, NY, USA.
¹¹ Vanderbilt University School of Medicine, Nashville, TN, USA.

PMID: 39927650
PMCID: PMC12161476
DOI: 10.1002/1878-0261.13783

Abstract

The National Cancer Institute (NCI) supports numerous research consortia that rely on imaging technologies to study cancerous tissues. To foster collaboration and innovation in this field, the Image Analysis Working Group (IAWG) was created in 2019. As multiplexed imaging techniques grow in scale and complexity, more advanced computational methods are required beyond traditional approaches like segmentation and pixel intensity quantification. In 2022, the IAWG held a virtual hackathon focused on addressing challenges in analyzing complex, high-dimensional datasets from fixed cancer tissues. The hackathon addressed key challenges in three areas: (1) cell type classification and assessment, (2) spatial data visualization and translation, and (3) scaling image analysis for large, multi-terabyte datasets. Participants explored the limitations of current automated analysis tools, developed potential solutions, and made significant progress during the hackathon. Here we provide a summary of the efforts and resultant resources and highlight remaining challenges facing the research community as emerging technologies are integrated into diverse imaging modalities and data analysis platforms.

Keywords: artifact removal; artifacts; cancer; computational scalability; domain representation; image analysis.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Fig. 1**
Strategies for artifact detection and correction. (A) Examples of common imaging artifacts in fluorescence microscopy. From left to right: antibody aggregates (bar = 100 μm), autofluorescent lint fibers (bar = 400 μm), air bubbles causing refractive index mismatch (bar = 125 μm), antibody hindrance (broad region of low antibody reactivity; bar = 400 μm), and out‐of‐focus tissue (bar = 50 μm). (B) CyCIF (cyclic immunofluorescence) datasets used for the artifact‐related hackathon challenges, featuring human colorectal cancer and tonsil tissue (bar = 1 mm). (C) A fibrous artifact and illumination errors are visible (*left*) and manually annotated (*middle*) to facilitate its detection and suppression (*right*). Scale bar represents 1 mm for all panels. (D) Receiver operating characteristic curve analysis for artifact detection performance of a multilayer perceptron trained on mean immunomarker signals alone (Features, FS1 in main text, *left*), or Features plus segmentation‐based nuclear morphology attributes (Nuc Morph) and pixel‐level image statistics (Pixel Thumb; FS3 in main text, *right*). Also see S4. (E) Comparison of before (*left*) and after (*right*) automatic artifact correction. Artifacts that have been significantly reduced or unresolved are shown with green or red boxes respectively. Regions without large artifact objects displayed similar intensity ranges across serial sections. Therefore, these either required minimal correction or were left unperturbed. Example regions highlighted with blue boxes. Scale bars represent 2 mm.
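
The receiver operating characteristic analysis in panel D can be illustrated with a minimal sketch. It assumes each cell has a binary artifact label and a classifier score, and that scores are distinct (ties are not handled). The function names and toy values below are hypothetical and are not taken from the hackathon code.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over the scores.

    Assumes distinct scores and at least one positive and one negative label.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Sort cells from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    sorted_labels = labels[order]
    # Cumulative true and false positives as the threshold is lowered.
    tps = np.cumsum(sorted_labels)
    fps = np.cumsum(~sorted_labels)
    tpr = np.concatenate([[0.0], tps / sorted_labels.sum()])
    fpr = np.concatenate([[0.0], fps / (~sorted_labels).sum()])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: 1 = artifact, 0 = clean cell (made-up values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_fpr, toy_tpr = roc_curve_points(toy_labels, toy_scores)
toy_auc = auc(toy_fpr, toy_tpr)
```

Comparing such curves for two feature sets, as in the left and right plots of panel D, amounts to running the same computation on the scores produced by each trained model.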

**Fig. 2**
Spatial spillover and visual comparisons of cell type calling. (A) Example of spatial crosstalk of adjacent cells in CyCIF stained images of tonsil. Boundaries of cells identified by segmentation are indicated by the dashed cyan lines and distinct cells are numbered. Pixel intensities from different markers are indicated by distinct colors. Spatial spillover of CD3 into adjacent cells is indicated by cyan arrows. Note that the cell segmentation boundaries had been previously generated and used as‐is within the hackathon; it is possible that cells 3 & 4 may represent a single oversegmented cell. Scale bar represents 20 μm. (B) Uniform Manifold Approximation and Projection (UMAP) of cell features and spatial representation of cells in a 200 × 200px tile before and after reinforcement dynamic spillover elimination (REDSEA). A novel Cluster 5 identified by REDSEA captures isolated cells at the image border (indicated by triangles). (C) A traditional heatmap and (D) violin‐matrix of cell data separated into clusters using the hierarchical density‐based spatial clustering of applications with noise (HDBSCAN) [37] algorithm. (E) Visualizations generated by a web‐based interactive tool for inspecting and comparing clustered data in a spatial context. Scatterplots of cells in UMAP embeddings with cells colored by cluster membership based on the respective clustering algorithms (*top row*) and colored by silhouette coefficients (*bottom row*). The plots are synchronized in navigation (zooming, panning, selections).
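
The silhouette coefficients used to color the scatterplots in panel E can be sketched as follows. This is a minimal illustration that assumes Euclidean distances between cell feature vectors and at least two clusters. The metric and implementation used by the interactive tool are not stated here, and all names below are hypothetical.

```python
import numpy as np

def silhouette_coefficients(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b).

    a: mean distance to the other members of the point's own cluster.
    b: smallest mean distance to the members of any other cluster.
    Requires at least two distinct cluster labels.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix (fine for small toy inputs).
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    clusters = np.unique(labels)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            continue  # singleton cluster: silhouette is defined as 0
        a = dist[i, same].sum() / (n_same - 1)  # self-distance is 0
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

# Toy example: two well-separated groups of two cells each (made-up values).
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_coefficients(toy_points, [0, 0, 1, 1])
bad = silhouette_coefficients(toy_points, [0, 1, 1, 1])
```

Values near 1 indicate cells that sit well inside their assigned cluster, while negative values, such as the second cell under the deliberately wrong labeling, flag cells that are closer to another cluster.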

**Fig. 3**
Image representation learning by VAEs and for thumbnail generation. (A) Each implementation of VAE was qualitatively assessed for their ability to distinguish control (phosphate‐buffered saline, PBS)‐treated from transforming growth factor (TGF)‐β‐treated MCF10A cells using all morpho‐spatial features or the top 10 variable (var) features compared to preselecting the top 10 discriminatory (discr) features extracted from the images. Feature space is reduced to two dimensions using UMAP embedding. Class labels of TGF‐β‐ or PBS‐treated cells are shown in pink and blue, respectively. (B) Example thumbnail images. Each panel shows a thumbnail (or associated comparative plot) generated by the methods described in the main text (panel labels). All approaches were applied to a 0.9 mm² (9 megapixels) 9‐channel cyclic immunofluorescence (CyCIF) image of a human tonsil germinal center. Scale bars represent 100 μm in all images.

**Fig. 4**
Data processing and visualization pipeline developed during the challenge for Neuroglancer. Highly multiplexed cyclic immunofluorescence (CyCIF) data are stored as multi‐channel imaging volumes (top, left), where each volume represents one channel. For simplicity, volumes are depicted as single slices in this figure. Each volume is segmented, either via thresholding or more complex machine learning approaches and stored as binary segmentation volume (top, middle). Subsequently, for each segmentation volume (i.e., segmented channel) the geometry of the segmented structures is extracted and stored as a geometry mesh for subsequent three‐dimensional (3D) surface rendering (top, right). The visualization pipeline supports a slice view that can combine an original imaging volume with several segmentation volumes (bottom, left) and a 3D view (bottom, right). The 3D view can represent the volume as extracted meshes or a clipping plane. All scale bars represent 50 μm except in the slice view visualization where it represents 20 μm.
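
The thresholding route to per-channel binary segmentation volumes described above can be sketched as follows. The sketch assumes the image is held in memory as a NumPy array with shape (channels, z, y, x) and that one intensity threshold per channel is supplied. The function name and toy values are hypothetical and do not reproduce the pipeline built during the challenge.

```python
import numpy as np

def segment_channels(volume, thresholds):
    """Return one boolean segmentation volume per channel.

    volume: array of shape (channels, z, y, x).
    thresholds: one intensity cutoff per channel.
    """
    volume = np.asarray(volume)
    thresholds = np.asarray(thresholds, dtype=float)
    if volume.ndim != 4 or thresholds.shape != (volume.shape[0],):
        raise ValueError("expected a (channels, z, y, x) volume and one threshold per channel")
    # Broadcast each channel's threshold over its z, y, x axes.
    return volume > thresholds.reshape(-1, 1, 1, 1)

# Toy example: two channels, a single 2 x 2 slice each (made-up values).
toy_volume = np.zeros((2, 1, 2, 2))
toy_volume[0, 0, 0, 0] = 5.0
toy_volume[1, 0, 1, 1] = 3.0
toy_masks = segment_channels(toy_volume, [1.0, 4.0])
```

Each resulting mask could then be passed to a surface-extraction routine (for example, a marching-cubes implementation such as `skimage.measure.marching_cubes`, named here only as one possible choice) to obtain the geometry meshes used for 3D rendering.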

Update of

Addressing persistent challenges in digital image analysis of cancerous tissues.
Prabhakaran S, Yapp C, Baker GJ, Beyer J, Chang YH, Creason AL, Krueger R, Muhlich J, Patterson NH, Sidak K, Sudar D, Taylor AJ, Ternes L, Troidl J, Xie Y, Sokolov A, Tyson DR; Cell Imaging Hackathon 2022 Participants. bioRxiv [Preprint]. 2023 Jul 24:2023.07.21.548450. doi: 10.1101/2023.07.21.548450. Update in: Mol Oncol. 2025 Jun;19(6):1565-1581. doi: 10.1002/1878-0261.13783. PMID: 37547011.

References

1. Wagner RP. Rudolph Virchow and the genetic basis of somatic ecology. Genetics. 1999;151(3):917–920.
2. Hajdu SI. A note from history: landmarks in history of cancer, part 4. Cancer. 2012;118(20):4914–4928.
3. The human body at cellular resolution: The NIH human biomolecular atlas program. Nature. 2019;574(7777):187–192.
4. Smith JM, Conroy RM. The NIH common fund human biomolecular atlas program (HuBMAP): building a framework for mapping the human body. FASEB J. 2018;32:818.
5. Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Birney E, et al. The human cell atlas. eLife. 2017;6:e27041.
