NPJ Precis Oncol. 2025 Jan 30;9(1):33.
doi: 10.1038/s41698-025-00799-8.

A comprehensive evaluation of histopathology foundation models for ovarian cancer subtype classification

Jack Breen et al. NPJ Precis Oncol.

Abstract

Histopathology foundation models show great promise across many tasks, but analyses have been limited by arbitrary hyperparameters. We report the most rigorous single-task validation study to date, specifically in the context of ovarian carcinoma morphological subtyping. Attention-based multiple instance learning classifiers were compared using three ImageNet-pretrained encoders and fourteen foundation models, each trained with 1864 whole slide images and validated through hold-out testing and two external validations (the Transcanadian Study and OCEAN Challenge). The best-performing classifier used the H-optimus-0 foundation model, with balanced accuracies of 89%, 97%, and 74%, though UNI achieved similar results at a quarter of the computational cost. Hyperparameter tuning the classifiers improved performance by a median 1.9% balanced accuracy, with many improvements being statistically significant. Foundation models improve classification performance and may allow for clinical utility, with models providing a second opinion in challenging cases and potentially improving the accuracy and efficiency of diagnoses.
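The headline metric above is balanced accuracy, which is the mean of the per-class recalls, so each subtype counts equally regardless of how many slides it has. The following is a minimal sketch with hypothetical labels (the slide counts and predictions are illustrative, not data from the study):

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced example: 6 HGSC, 2 CCC, 2 MC slides.
y_true = ["HGSC"] * 6 + ["CCC"] * 2 + ["MC"] * 2
y_pred = ["HGSC"] * 6 + ["CCC", "HGSC"] + ["MC", "EC"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, bal_acc)
```

Here plain accuracy is 0.8, while balanced accuracy is about 0.67, because the errors fall on the two rarer classes.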

Conflict of interest statement

Competing interests: N.M.O.’s fellowship is funded by 4D Path. All other authors declare no competing interests.

Figures

Fig. 1. Classification model pipeline.
Attention-based multiple instance learning (ABMIL) classifier for ovarian cancer subtyping, showing the classification of a high-grade serous carcinoma (HGSC).
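The core of an ABMIL classifier is attention pooling: each patch feature vector receives a learned score, the scores are normalised with a softmax, and the slide-level representation is the attention-weighted sum of the patch features. The sketch below shows the basic (non-gated) form of that pooling step. The weight matrices `V` and `w`, the toy features, and the function name are hypothetical; the caption does not specify the exact architecture used in the study.

```python
import math


def attention_pool(patch_features, V, w):
    """Pool patch features into one slide vector using softmax attention.

    score_k = w . tanh(V h_k); weights = softmax(scores); slide = sum_k a_k h_k
    """
    scores = []
    for h in patch_features:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over the patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(patch_features[0])
    slide = [sum(a * h[d] for a, h in zip(weights, patch_features)) for d in range(dim)]
    return slide, weights


# Hypothetical toy inputs: three patches with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]  # illustrative attention projection
w = [1.0, -1.0]               # illustrative attention vector

slide_embedding, attention = attention_pool(features, V, w)
print(slide_embedding, attention)
```

In a full classifier the slide embedding would then pass through a small classification head, and `V` and `w` would be learned jointly with it.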
Fig. 2. Ovarian cancer subtyping results.
The mean and 95% confidence interval generated by 10,000 iterations of bootstrapping for each metric. Blue indicates ImageNet-pretrained feature extractors and orange indicates histopathology foundation models. Hold-out testing and external validation results are based on an ensemble of cross-validation models. Precise values are provided in Supplementary Tables 3–6.
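A bootstrap estimate of this kind can be sketched as follows: resample the slides with replacement, recompute the metric on each resample, and summarise the resulting distribution. The percentile interval used here is an assumption, since the caption does not state which bootstrap interval was computed, and plain accuracy stands in for the reported metrics to keep the example short. The labels are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, metric, n_iter=10_000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) of a metric over bootstrap resamples.

    Assumes a percentile interval; other bootstrap intervals exist.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]  # sample slides with replacement
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return sum(stats) / n_iter, lower, upper


# Hypothetical predictions: 32 of 40 slides classified correctly.
y_true = [1] * 40
y_pred = [1] * 32 + [0] * 8

mean, lower, upper = bootstrap_ci(y_true, y_pred, accuracy)
print(mean, lower, upper)
```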
Fig. 3. Optimal confusion matrices.
The confusion matrix from each validation for the optimal ABMIL classifier with features from the H-optimus-0 foundation model. Correct classifications are indicated in green.
Fig. 4. Model inference times.
The average inference time per WSI for each model, including tissue patch extraction, feature encoding, and ABMIL classification time.
Fig. 5. Preprocessing analysis results.
Comparison of the balanced accuracy for each ImageNet-pretrained feature extractor (blue), the seven ResNet50 models with varied preprocessing techniques (green), the three worst-performing foundation models (RN18-Histo, RN50-Histo, and CTransPath), and the single best-performing foundation model (H-optimus-0) in (a) cross-validation, (b) hold-out testing, (c) external validation on the Transcanadian Study dataset, and (d) external validation on the OCEAN Challenge dataset. For validations (b)–(d), predictions were ensembled from the five cross-validation models. Results are reported as the mean and 95% confidence interval generated by 10,000 iterations of bootstrapping. Precise values and other metrics are presented in Supplementary Tables 12–15.
Fig. 6. Validation losses.
The average validation loss from five-fold cross-validation for each model across each hyperparameter tuning iteration.
Fig. 7. Results of hyperparameter tuning.
The balanced accuracy compared for each ABMIL model trained with the default hyperparameters (pink) and the tuned hyperparameters (blue) in (a) cross-validation, (b) hold-out testing, (c) external validation on the Transcanadian Study dataset, and (d) external validation on the OCEAN Challenge dataset. For validations (b)–(d), predictions were ensembled from the five cross-validation models. *Indicates a significant difference in the paired t-test at the 5% significance level.
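A paired t-test compares two sets of matched scores by testing whether their mean difference is zero. The sketch below assumes the pairing is over the five cross-validation folds, which the caption does not state, and uses hypothetical per-fold balanced accuracies. To stay within the standard library it compares the statistic against the two-sided 5% critical value for 4 degrees of freedom instead of computing a p-value.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
# Only valid for exactly five paired observations.
T_CRIT_DF4 = 2.776


def paired_t_statistic(a, b):
    """Return the paired t statistic for the differences b - a."""
    diffs = [y - x for x, y in zip(a, b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Hypothetical per-fold balanced accuracies (not results from the study).
default_scores = [0.80, 0.82, 0.79, 0.81, 0.83]
tuned_scores = [0.83, 0.84, 0.82, 0.83, 0.86]

t_stat = paired_t_statistic(default_scores, tuned_scores)
significant = abs(t_stat) > T_CRIT_DF4
print(t_stat, significant)
```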
Fig. 8. Accuracy compared to efficiency.
Balanced accuracy results for each histopathology foundation model-based classifier in each validation shown in relation to the number of model parameters and number of WSIs used in the pretraining of the foundation model. The line of best fit and the corresponding coefficient of determination (R²) are provided for each validation.
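A least-squares line of best fit and its coefficient of determination can be computed as sketched below. The `xs` and `ys` values are placeholders, not measurements from the study, and any axis scaling (for example a logarithmic scale for parameter counts) would be an additional assumption applied before fitting.

```python
def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Placeholder data standing in for (model size, balanced accuracy) pairs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]

slope, intercept, r_squared = fit_line(xs, ys)
print(slope, intercept, r_squared)
```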
Fig. 9. Attention heatmaps.
Example attention heatmaps from the ABMIL classifier using the ImageNet-pretrained ResNet50 and UNI foundation model features. (Upper) A typical difference between heatmaps with different diagnoses. (Lower) The most extreme qualitative difference found between heatmaps in the internal test set. In both examples, the UNI classification was correct (upper—MC, lower—CCC), and the ResNet50 classification was incorrect (upper—EC, lower—MC). These heatmaps are based on 256 × 256 pixel patches with 50% overlap at 10× apparent magnification, with visual differences in scale caused by the variable size of resection samples.
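With 256 × 256 pixel patches and 50% overlap, neighbouring patches are offset by a stride of 128 pixels in each direction. The sketch below enumerates the top-left patch coordinates for a rectangular region. The function name and the choice to drop partial patches at the edges are assumptions made for illustration.

```python
def patch_origins(width, height, patch_size=256, overlap=0.5):
    """Return top-left (x, y) coordinates of overlapping square patches.

    Assumes partial patches at the right and bottom edges are discarded.
    """
    stride = int(patch_size * (1 - overlap))  # 256 px at 50% overlap -> 128 px
    xs = range(0, width - patch_size + 1, stride)
    ys = range(0, height - patch_size + 1, stride)
    return [(x, y) for y in ys for x in xs]


# A hypothetical 512 x 512 pixel region yields a 3 x 3 grid of patches.
origins = patch_origins(512, 512)
print(len(origins), origins[0], origins[-1])
```

Because of the overlap, most pixels are covered by more than one patch, so a heatmap has to combine the attention values of overlapping patches in some way, for example by averaging them.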
