Review
Mol Oncol. 2025 Jun;19(6):1565-1581.
doi: 10.1002/1878-0261.13783. Epub 2025 Feb 10.

Addressing persistent challenges in digital image analysis of cancer tissue: resources developed from a hackathon

Sandhya Prabhakaran et al. Mol Oncol. 2025 Jun.

Abstract

The National Cancer Institute (NCI) supports numerous research consortia that rely on imaging technologies to study cancerous tissues. To foster collaboration and innovation in this field, the Image Analysis Working Group (IAWG) was created in 2019. As multiplexed imaging techniques grow in scale and complexity, more advanced computational methods are required beyond traditional approaches like segmentation and pixel intensity quantification. In 2022, the IAWG held a virtual hackathon focused on addressing challenges in analyzing complex, high-dimensional datasets from fixed cancer tissues. The hackathon addressed key challenges in three areas: (1) cell type classification and assessment, (2) spatial data visualization and translation, and (3) scaling image analysis for large, multi-terabyte datasets. Participants explored the limitations of current automated analysis tools, developed potential solutions, and made significant progress during the hackathon. Here we provide a summary of the efforts and resultant resources and highlight remaining challenges facing the research community as emerging technologies are integrated into diverse imaging modalities and data analysis platforms.

Keywords: artifact removal; artifacts; cancer; computational scalability; domain representation; image analysis.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
Strategies for artifact detection and correction. (A) Examples of common imaging artifacts in fluorescence microscopy. From left to right: antibody aggregates (bar = 100 μm), autofluorescent lint fibers (bar = 400 μm), air bubbles causing refractive index mismatch (bar = 125 μm), antibody hindrance (broad region of low antibody reactivity; bar = 400 μm), and out‐of‐focus tissue (bar = 50 μm). (B) CyCIF (cyclic immunofluorescence) datasets used for the artifact‐related hackathon challenges, featuring human colorectal cancer and tonsil tissue (bar = 1 mm). (C) A fibrous artifact and illumination errors are visible (left) and were manually annotated (middle) to facilitate their detection and suppression (right). Scale bar represents 1 mm for all panels. (D) Receiver operating characteristic curve analysis of artifact detection performance for a multilayer perceptron trained on mean immunomarker signals alone (Features, FS1 in main text, left), or on Features plus segmentation‐based nuclear morphology attributes (Nuc Morph) and pixel‐level image statistics (Pixel Thumb; FS3 in main text, right). Also see S4. (E) Comparison of before (left) and after (right) automatic artifact correction. Artifacts that were significantly reduced are shown with green boxes, and artifacts that remained unresolved with red boxes. Regions without large artifact objects displayed similar intensity ranges across serial sections; these therefore either required minimal correction or were left unperturbed. Example regions are highlighted with blue boxes. Scale bars represent 2 mm.
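As an illustration of the receiver operating characteristic analysis mentioned in panel D, the following is a minimal sketch of how an ROC curve and its area can be computed from per-cell artifact scores. It is not the authors' code: the function names `roc_points` and `auc`, and the toy labels and scores, are assumptions made for this example, and any classifier output (for instance from a multilayer perceptron) could be substituted for the scores.

```python
def roc_points(labels, scores):
    """Sweep a decision threshold over scores and return (FPR, TPR) points.

    labels: 1 for artifact, 0 for non-artifact (hypothetical ground truth).
    scores: higher values mean "more likely an artifact".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Samples with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]          # made-up annotations
    toy_scores = [0.9, 0.8, 0.7, 0.1]  # made-up classifier outputs
    print(auc(roc_points(toy_labels, toy_scores)))  # 0.75
```

An area of 1.0 corresponds to perfect separation of artifact and non-artifact cells, and 0.5 to chance-level ranking, which is the scale on which the two feature sets in panel D would be compared.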
Fig. 2
Spatial spillover and visual comparisons of cell type calling. (A) Example of spatial crosstalk between adjacent cells in CyCIF‐stained images of tonsil. Boundaries of cells identified by segmentation are indicated by the dashed cyan lines and distinct cells are numbered. Pixel intensities from different markers are indicated by distinct colors. Spatial spillover of CD3 into adjacent cells is indicated by cyan arrows. Note that the cell segmentation boundaries had been previously generated and were used as‐is within the hackathon; it is possible that cells 3 and 4 represent a single oversegmented cell. Scale bar represents 20 μm. (B) Uniform Manifold Approximation and Projection (UMAP) of cell features and spatial representation of cells in a 200 × 200 px tile before and after reinforcement dynamic spillover elimination (REDSEA). A novel Cluster 5 identified by REDSEA captures isolated cells at the image border (indicated by triangles). (C) A traditional heatmap and (D) violin‐matrix of cell data separated into clusters using the hierarchical density‐based spatial clustering of applications with noise (HDBSCAN) [37] algorithm. (E) Visualizations generated by a web‐based interactive tool for inspecting and comparing clustered data in a spatial context. Scatterplots of cells in UMAP embeddings, with cells colored by cluster membership based on the respective clustering algorithms (top row) and by silhouette coefficients (bottom row). The plots are synchronized in navigation (zooming, panning, selections).
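The silhouette coefficients used for coloring in panel E follow a standard definition: for each cell, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the coefficient is (b - a) / max(a, b). The sketch below computes it with the Python standard library only. The function name `silhouette_scores` and the toy coordinates are assumptions for illustration, and the code makes no claim about how the interactive tool itself computes these values.

```python
import math


def silhouette_scores(points, labels):
    """Per-point silhouette coefficient using Euclidean distance.

    points: list of coordinate tuples (for example, 2D embedding coordinates).
    labels: cluster label for each point.
    Points in singleton clusters receive 0.0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


if __name__ == "__main__":
    toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]  # made-up coordinates
    print(silhouette_scores(toy_points, [0, 0, 1, 1]))  # well separated
    print(silhouette_scores(toy_points, [0, 1, 0, 1]))  # poorly assigned
```

Values near 1 indicate cells that sit firmly inside their cluster, values near 0 indicate cells on a boundary, and negative values suggest a cell is closer to another cluster than to its own.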
Fig. 3
Image representation learning by variational autoencoders (VAEs) and for thumbnail generation. (A) Each VAE implementation was qualitatively assessed for its ability to distinguish control (phosphate‐buffered saline, PBS)‐treated from transforming growth factor (TGF)‐β‐treated MCF10A cells using all morpho‐spatial features or the top 10 variable (var) features, compared to preselecting the top 10 discriminatory (discr) features extracted from the images. Feature space is reduced to two dimensions using UMAP embedding. Class labels of TGF‐β‐ or PBS‐treated cells are shown in pink and blue, respectively. (B) Example thumbnail images. Each panel shows a thumbnail (or associated comparative plot) generated by the methods described in the main text (panel labels). All approaches were applied to a 0.9 mm² (9 megapixels) 9‐channel cyclic immunofluorescence (CyCIF) image of a human tonsil germinal center. Scale bars represent 100 μm in all images.
Fig. 4
Data processing and visualization pipeline developed during the challenge for Neuroglancer. Highly multiplexed cyclic immunofluorescence (CyCIF) data are stored as multi‐channel imaging volumes (top, left), where each volume represents one channel. For simplicity, volumes are depicted as single slices in this figure. Each volume is segmented, either via thresholding or via more complex machine learning approaches, and stored as a binary segmentation volume (top, middle). Subsequently, for each segmentation volume (i.e., segmented channel), the geometry of the segmented structures is extracted and stored as a geometry mesh for subsequent three‐dimensional (3D) surface rendering (top, right). The visualization pipeline supports a slice view that can combine an original imaging volume with several segmentation volumes (bottom, left) and a 3D view (bottom, right). The 3D view can represent the volume as extracted meshes or as a clipping plane. All scale bars represent 50 μm, except in the slice view visualization, where the bar represents 20 μm.
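The simplest segmentation route named in this caption, per‐channel thresholding into a binary segmentation volume, can be sketched as follows. This is an assumption‐laden illustration and not the pipeline's actual code: the helper names `threshold_volume` and `segment_channels`, the nested‐list `[z][y][x]` layout, and the threshold values are all hypothetical, and the mesh extraction step is not covered.

```python
def threshold_volume(volume, threshold):
    """Turn an intensity volume into a binary segmentation volume.

    volume: nested lists indexed as [z][y][x] (hypothetical layout).
    Voxels at or above the threshold become 1, all others 0.
    """
    return [
        [[1 if value >= threshold else 0 for value in row] for row in plane]
        for plane in volume
    ]


def segment_channels(channels, thresholds):
    """Apply one threshold per channel; each channel is its own volume."""
    return {
        name: threshold_volume(volume, thresholds[name])
        for name, volume in channels.items()
    }


if __name__ == "__main__":
    # One made-up channel with a single 2 x 2 slice.
    toy_channels = {"CD3": [[[0, 5], [10, 2]]]}
    toy_thresholds = {"CD3": 5}
    print(segment_channels(toy_channels, toy_thresholds))
    # {'CD3': [[[0, 1], [1, 0]]]}
```

Keeping one binary volume per channel mirrors the caption's description, in which each segmented channel is later converted to its own mesh for 3D rendering.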
