This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2025 Jun 11:2025.06.10.25328700.

doi: 10.1101/2025.06.10.25328700.

Multicenter Histology Image Integration and Multiscale Deep Learning for Machine Learning-Enabled Pediatric Sarcoma Classification

Adam Thiesen¹,², Sergii Domanskyi¹, Ali Foroughi Pour¹, Jingyan Zhang¹,³, Todd B Sheridan⁴, Steven B Neuhauser¹, Alyssa Stetson⁵, Katelyn Dannheim⁵, Danielle B Cameron⁵, Shawn Ahn⁶, Hao Wu⁷, Emily R Christison Lagay⁷, Carol J Bult⁸, Jeffrey H Chuang¹,², Jill C Rubinstein¹,⁴,²

Affiliations

¹ The Jackson Laboratory for Genomic Medicine.
² UConn School of Medicine.
³ Johns Hopkins University.
⁴ Hartford Healthcare.
⁵ Harvard School of Medicine.
⁶ Department of Surgery, University of Pennsylvania.
⁷ Yale School of Medicine.
⁸ The Jackson Laboratory for Mammalian Genetics.

PMID: 40585079
PMCID: PMC12204438
DOI: 10.1101/2025.06.10.25328700

Adam Thiesen et al. medRxiv. 2025.

Abstract

Pediatric sarcomas present diagnostic challenges due to their rarity and diverse subtypes, often requiring specialized pathology expertise and costly genetic tests. To overcome these barriers, we developed a computational pipeline leveraging deep learning methods to accurately classify pediatric sarcoma subtypes from digitized histology slides. To ensure classifier generalizability and minimize center-specific artifacts, we collected and harmonized a dataset comprising 867 whole slide images (WSIs) from three medical centers and the Children's Oncology Group (COG). Multiple convolutional neural network (CNN) and vision transformer (ViT) architectures were systematically evaluated as feature extractors for SAMPLER-based WSI representations, and input parameters such as tile size combinations and resolutions were tested and optimized. Our analysis showed that advanced ViT foundation models (UNI, CONCH) significantly outperformed earlier approaches, and that incorporating multiscale features can enhance classification accuracy. Our optimized models achieved high performance, distinguishing rhabdomyosarcoma (RMS) from non-rhabdomyosarcoma soft tissue sarcoma (NRSTS) with an AUC of 0.969±0.026 and differentiating RMS subtypes (alveolar vs. embryonal) with an AUC of 0.961±0.021. Additionally, a two-stage pipeline effectively distinguished scarce Ewing sarcoma images from other NRSTS (AUC 0.929). Compared to conventional transformer encoder architectures used for WSI representations, our SAMPLER-based classifiers were more lightweight (0.111 MB vs. 1.9 MB) and three orders of magnitude faster to train. This study highlights that digital histopathology paired with rigorous image harmonization provides a powerful solution for pediatric sarcoma classification. Our models reduce inter-observer variability, augment diagnostic precision, and have the potential to increase global accessibility to robust diagnostics, improving time to diagnosis and subsequent treatment planning.
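The abstract does not spell out how a SAMPLER-based WSI representation is built or how multiscale features are combined. The sketch below is one plausible reading, not the authors' implementation: it assumes each deep feature is summarized across a slide's tiles by a fixed set of quantiles, and that features from different tile scales are concatenated into a single slide vector. The function names, the number of quantiles, and the random feature matrices are all hypothetical.

```python
import numpy as np

def sampler_representation(tile_features, n_quantiles=10):
    """Summarize an (n_tiles, n_features) matrix by per-feature quantiles.

    Assumption: a quantile summary stands in for the SAMPLER encoding.
    """
    tile_features = np.asarray(tile_features, dtype=float)
    # Evenly spaced quantile levels inside (0, 1): 0.05, 0.15, ..., 0.95.
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    # np.quantile returns shape (n_quantiles, n_features).
    q = np.quantile(tile_features, levels, axis=0)
    # Flatten feature-major so each feature's quantiles stay contiguous.
    return q.T.ravel()

def multiscale_representation(features_by_scale, n_quantiles=10):
    """Concatenate the slide vectors computed at each tile scale."""
    parts = [sampler_representation(f, n_quantiles) for f in features_by_scale]
    return np.concatenate(parts)

# Hypothetical tile features for one slide at two scales (8-dim features).
rng = np.random.default_rng(0)
scale1 = rng.normal(size=(500, 8))  # many small tiles
scale2 = rng.normal(size=(120, 8))  # fewer, larger tiles
slide_vector = multiscale_representation([scale1, scale2])
```

Because the slide vector has a fixed length regardless of tile count, a simple classifier such as logistic regression can be trained on it directly, which is consistent with the small model size the abstract reports.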

Figures

**Figure 1. Workflow for processing whole slide images (WSI).**
The process begins with converting WSIs into a tifffile-readable format. The Automatic Object Identification (AOI) tool is then used to identify regions of interest (ROI) within the tissue samples, generating JSON files. Files of histology artifacts are filtered out. The STQ pipeline is executed in “arbitrary grid” mode to process each image and generate focus maps, a tiling grid, and imaging and nuclear morphometric features. The focus maps are examined for quality, with out-of-focus regions re-scanned or excluded. Finally, a SAMPLER WSI representation is generated for each tissue, followed by downstream classification analyses. Quality control has three distinct stages, indicated with yellow pointers.

**Figure 2. Benchmarking of deep learning backbones and image scales for pediatric alveolar vs embryonal sarcoma histological subtype classification.**
A, Mean area-under-ROC curve values for each of the 4 backbones at each FOV. B, Mean AUROCs from 5-fold cross validation with fixed random seeds for pairwise combinations of feature sets, where each feature set derives from a FOV for a given backbone. C, ROC curve for the best performing backbone combination, as measured over 100 5-fold cross-validation iterations. D, Plot of precision, recall, and F1 score curves for the best performing backbone combination (CONCH, FOV 1+2). E, Confusion matrix using the optimal threshold, averaged across all cross-validation iterations and expressed as percentages. F, Precision, recall, F1 score, and accuracy for the best performing backbone combination.
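Panels D through F report precision, recall, and F1 score, with the confusion matrix taken at an "optimal threshold" that the caption does not define. As a minimal sketch, the code below computes the three metrics at a given threshold and, as an assumption made only for illustration, treats the threshold that maximizes F1 as optimal. The function names, threshold grid, and toy labels and scores are hypothetical.

```python
import numpy as np

def prf_at_threshold(y_true, y_score, threshold):
    """Precision, recall, and F1 for the positive class at one threshold."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_score, dtype=float) >= threshold
    tp = int(np.sum(y_pred & y_true))
    fp = int(np.sum(y_pred & ~y_true))
    fn = int(np.sum(~y_pred & y_true))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def best_f1_threshold(y_true, y_score, thresholds=None):
    """Return the first threshold on the grid with the highest F1.

    Assumption: "optimal" means F1-maximizing; the source does not say.
    """
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)
    f1_scores = [prf_at_threshold(y_true, y_score, t)[2] for t in thresholds]
    return float(thresholds[int(np.argmax(f1_scores))])

# Toy held-out fold: 1 = alveolar, 0 = embryonal (hypothetical values).
y_true = [0, 0, 1, 1, 1, 0]
y_score = [0.12, 0.41, 0.37, 0.83, 0.91, 0.22]
precision, recall, f1 = prf_at_threshold(y_true, y_score, 0.5)
chosen = best_f1_threshold(y_true, y_score)
```

Other criteria, such as maximizing Youden's J on the ROC curve, would be equally reasonable choices and can select a different threshold.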

**Figure 3. Alternative workflow using MHSA transformer architecture for pediatric sarcoma histological subtype classification.**
A, As a comparison to the SAMPLER and logistic-regression approach, we implemented the MHSA Transformer architecture to learn a WSI representation based on the interactions between tiles and then make classification predictions. B, ROC curve for the MHSA Transformer model in the task of Alveolar vs Embryonal classification. C, Averaged confusion matrix at threshold 0.5 for the Transformer. D, Precision, recall, F1 score, and accuracy metrics for the Transformer model.

**Figure 4. Benchmarking of deep learning backbones and image scales for pediatric RMS vs NRSTS sarcoma histological type classification.**
A, Comparison of mean AUCs for the 4 deep learning backbones at each FOV. B, Comparison of combinations of concatenated FOV features for each backbone, mean AUC. C, ROC curve for the best performing backbone combination over 100 iterations. D, Plot of precision, recall, and F1 score curves for the best performing backbone combination. E, Confusion matrix across all fold test sets, using the optimal threshold, expressed as percentages. F, Precision, recall, F1 score, and accuracy for the best performing backbone combination.

**Figure 5. Two-stage classification of Ewing sarcoma.**
A, Schematic showing the two-stage classification workflow, where models are trained on all images in the dataset, and predicted NRSTS cases are moved to the second stage of classification. B, Confusion matrix for the combined stages, shown at threshold 0.8. C, ROC curve for the first stage of classification, NRSTS vs RMS. Any sample classified as NRSTS then moves to the next round for further subtyping. D, ROC curve for Ewing vs non-Ewing classification. **E/F/G,** Precision, recall, F1 score, accuracy, and support for the two models.
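The two-stage routing described in this caption can be sketched as a small decision function. This is an illustration under stated assumptions rather than the authors' code: the caption does not say which stage the 0.8 threshold applies to, so both thresholds are exposed as parameters, and the function name, label strings, and default values are hypothetical.

```python
def two_stage_predict(p_nrsts, p_ewing, stage1_threshold=0.5, stage2_threshold=0.8):
    """Route each slide through NRSTS-vs-RMS, then Ewing-vs-non-Ewing.

    p_nrsts: stage 1 probabilities that a slide is NRSTS.
    p_ewing: stage 2 probabilities that a slide is Ewing sarcoma.
    Both default thresholds are assumptions made for illustration.
    """
    labels = []
    for p1, p2 in zip(p_nrsts, p_ewing):
        if p1 < stage1_threshold:
            # Stage 1 calls the slide RMS; stage 2 is never consulted.
            labels.append("RMS")
        elif p2 >= stage2_threshold:
            labels.append("Ewing")
        else:
            labels.append("non-Ewing NRSTS")
    return labels

# Three hypothetical slides.
labels = two_stage_predict([0.2, 0.9, 0.9], [0.99, 0.85, 0.30])
```

Note that a stage 1 error cannot be recovered in stage 2: a true Ewing slide called RMS in the first stage is never subtyped, which is why the combined confusion matrix is the relevant summary.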

**Figure 6. Spatial inference of histological types and subtypes of sarcoma.**
Panels A-C represent different subtypes of non-rhabdomyosarcoma soft tissue sarcomas (NRSTS) and rhabdomyosarcomas (RMS). Each panel includes three images: Hematoxylin and Eosin (H&E) stained tissue section (left), probability heatmap of NRSTS (middle), and probability heatmap of the specific subtype (right). A, Synovial NRSTS. B, Alveolar RMS. C, Ewing NRSTS. The heatmaps illustrate the spatial distribution of contributing features, which may aid in pathologist visual assessment and further validation by other assays.
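A probability heatmap of this kind can be assembled by placing each tile's predicted probability at its position on the tiling grid. The sketch below assumes tiles are indexed by integer grid row and column and that a tile-level probability is already available; how the authors derive tile-level probabilities is not stated here, and the function name and inputs are hypothetical.

```python
import numpy as np

def tile_probability_heatmap(rows, cols, probs):
    """Arrange tile-level probabilities on a 2D grid.

    Grid cells with no tissue tile are left as NaN so that plotting
    tools can render them as background.
    """
    rows = np.asarray(rows, dtype=int)
    cols = np.asarray(cols, dtype=int)
    heatmap = np.full((rows.max() + 1, cols.max() + 1), np.nan)
    heatmap[rows, cols] = np.asarray(probs, dtype=float)
    return heatmap

# Three hypothetical tiles on a 2 x 2 grid; cell (1, 0) has no tissue.
heatmap = tile_probability_heatmap([0, 0, 1], [0, 1, 1], [0.1, 0.9, 0.5])
```

The resulting array can be displayed next to the H&E image with any standard plotting library, using a colormap that masks NaN cells.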
