Nature. 2024 Jun;630(8015):181-188.
doi: 10.1038/s41586-024-07441-w. Epub 2024 May 22.

A whole-slide foundation model for digital pathology from real-world data

Hanwen Xu et al. Nature. 2024 Jun.

Abstract

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles [1-3]. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context [4]. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet [5] method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data [6]. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology [7,8] by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.

Conflict of interest statement

C.B. is a member of the scientific advisory board and owns stock in PrimeVax and BioAI; is on the scientific board of Lunaphore and SironaDx; has a consultant or advisory relationship with Sanofi, Agilent, Roche and Incendia; contributes to institutional research for Illumina, and is an inventor on US patent applications US20180322632A1 (Image Processing Systems and Methods for Displaying Multiple Images of a Biological Specimen) filed by Ventana Medical Systems, Providence Health and Services Oregon and US20200388033A1 (System and Method for Automatic Labeling of Pathology Images) filed by Providence Health and Services Oregon, Omics Data Automation. The other authors declare no competing interests.

Figures

Fig. 1. Overview of Prov-GigaPath.
a, Flow chart showing the model architecture of Prov-GigaPath. Prov-GigaPath first serializes each input WSI into a sequence of 256 × 256 image tiles in row-major order and uses an image tile-level encoder to convert each image tile into a visual embedding. Then Prov-GigaPath applies a slide-level encoder based on the LongNet architecture to generate contextualized embeddings, which can serve as the basis for various downstream applications. b, Image tile-level pretraining using DINOv2. c, Slide-level pretraining with LongNet using masked autoencoder. [CLS] is the classification token.
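The serialization step in panel a can be illustrated with a minimal sketch. It only enumerates the top-left corners of 256 × 256 tiles in row-major order (left to right within a row, rows from top to bottom). The function name `tile_coordinates` is hypothetical, and dropping partial tiles at the right and bottom edges is an assumption; the caption does not say how edge tiles or background regions are handled.

```python
from typing import List, Tuple

TILE = 256  # tile edge length in pixels, as stated in the caption


def tile_coordinates(width: int, height: int, tile: int = TILE) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) corner of every full tile, in row-major order.

    Assumption: tiles that would extend past the slide boundary are dropped.
    """
    coords = []
    for y in range(0, height - tile + 1, tile):      # rows, top to bottom
        for x in range(0, width - tile + 1, tile):   # columns, left to right
            coords.append((x, y))
    return coords


if __name__ == "__main__":
    # A toy 768 x 512 "slide" yields 3 tiles per row and 2 rows.
    for index, corner in enumerate(tile_coordinates(768, 512)):
        print(index, corner)
```

Each position in the returned list corresponds to one token in the sequence that the tile-level encoder would embed and the slide-level encoder would then contextualize.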
Fig. 2. Gene mutation prediction.
a-j, Bar plots comparing the AUROC and AUPRC scores of Prov-GigaPath and competing methods on pan-cancer 18-biomarker (a,f), LUAD-specific 5-gene mutation prediction (b,g), pan-cancer 5-gene mutation prediction (c,h), LUAD-specific 5-gene mutation prediction on TCGA (d,i) and pan-cancer TMB prediction (e,j). k, Bar plot showing AUROC for each gene on LUAD-specific five-gene mutation prediction on TCGA. a-k, Data are mean ± s.e.m. across n = 10 independent experiments. The listed P value indicates the significance for Prov-GigaPath outperforming the best comparison approach, with one-sided Wilcoxon test. l, Comparison of AUROC scores for individual biomarkers in pan-cancer 18-biomarker predictions.
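The "mean ± s.e.m. across n = 10 independent experiments" summary used in these panels can be computed as in the sketch below. It assumes the standard error is the sample standard deviation (n - 1 denominator) divided by the square root of n; the caption does not state which estimator was used. The function name `mean_sem` and the input scores are illustrative placeholders, not values from the paper.

```python
import math
from statistics import mean, stdev


def mean_sem(scores):
    """Return (mean, standard error of the mean) for a list of per-run scores.

    Assumption: s.e.m. = sample standard deviation / sqrt(n).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two runs to estimate the s.e.m.")
    return mean(scores), stdev(scores) / math.sqrt(n)


if __name__ == "__main__":
    # Placeholder scores for four hypothetical runs (not results from the paper).
    m, sem = mean_sem([1.0, 2.0, 3.0, 4.0])
    print(f"{m:.3f} ± {sem:.3f}")
```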
Fig. 3. Comparison on cancer subtyping.
a-f, Bar plots comparing cancer subtyping performance in terms of AUROC (a,c,e) and balanced accuracy (b,d,f) on nine cancer types. Data are mean ± s.e.m. across n = 10 independent experiments. The listed P value indicates the significance for Prov-GigaPath outperforming the best comparison approach, with one-sided Wilcoxon test. BACC, balanced accuracy. BRCA, breast invasive carcinoma; CNS, central nervous system; COADREAD, colorectal adenocarcinoma; DIFG, diffuse intrinsic pontine glioma; EGC, early gastric cancer; HB, hepatobiliary; NSCLC, non-small cell lung cancer; OVT, ovarian cancer; RCC, renal cell cancer.
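Balanced accuracy (BACC) is commonly defined as the unweighted mean of per-class recall, which keeps a large class from dominating the score. The sketch below follows that common definition; the caption does not spell out the exact formula the authors used, and the function name and labels are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy labels: class "A" has recall 3/4, class "B" has recall 1/2.
    truth = ["A", "A", "A", "A", "B", "B"]
    preds = ["A", "A", "A", "B", "B", "A"]
    print(balanced_accuracy(truth, preds))
```

In this toy case plain accuracy is 4/6, while balanced accuracy is the mean of 0.75 and 0.5, which shows how the two metrics diverge on imbalanced classes.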
Fig. 4. Comparison on image–report alignment.
a, Flow chart showing the fine-tuning of Prov-GigaPath using pathology reports. Real-world pathology reports are processed using GPT-3.5 from OpenAI to remove information irrelevant to cancer diagnosis. We performed CLIP-based contrastive learning to align Prov-GigaPath and PubMedBERT. b, The fine-tuned Prov-GigaPath can then be used to perform zero-shot cancer subtyping and mutation prediction. The input of Prov-GigaPath is a sequence of tiles segmented from a WSI, and the inputs of the text encoder PubMedBERT are manually designed prompts representing cancer types and mutations. Based on the outputs of Prov-GigaPath and PubMedBERT, we can calculate the probability of the input WSI being classified into specific cancer subtypes and mutations. c, Bar plots comparing zero-shot subtyping performance on NSCLC and COADREAD in terms of BACC, precision and F1. d, Bar plots comparing the performance on mutation prediction using the fine-tuned model for six genes. c,d, Data are mean ± s.e.m. across n = 50 experiments. The listed P value indicates the significance for Prov-GigaPath outperforming the best comparison approach, with one-sided Wilcoxon test. e, Scatter plots comparing the performance between Prov-GigaPath and MI-Zero in terms of BACC on zero-shot cancer subtyping. Each dot indicates one trial with a particular set of text query formulations.
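The zero-shot step in panel b can be sketched in the usual CLIP style: compare the slide embedding with the embedding of each text prompt, then turn the similarities into class probabilities with a softmax. This is an illustration under stated assumptions, not the authors' implementation. The embeddings are toy vectors standing in for the outputs of the slide encoder and the text encoder, the labels are hypothetical, and the `temperature` value is an arbitrary placeholder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(slide_emb, prompt_embs, temperature=0.07):
    """Softmax over temperature-scaled cosine similarities.

    slide_emb: embedding of one slide (toy vector here).
    prompt_embs: dict mapping a class label to its prompt embedding.
    temperature: assumed placeholder value, not taken from the paper.
    """
    logits = {label: cosine(slide_emb, emb) / temperature
              for label, emb in prompt_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    norm = sum(exps.values())
    return {label: value / norm for label, value in exps.items()}


if __name__ == "__main__":
    prompts = {"subtype_a": [1.0, 0.0], "subtype_b": [0.0, 1.0]}  # hypothetical
    print(zero_shot_probs([0.9, 0.1], prompts))
```

The predicted class is the label with the highest probability. Varying the wording of the prompts changes the prompt embeddings, which is why panel e reports one dot per set of text query formulations.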
Extended Data Fig. 1. Comparison on Pan-cancer 18-biomarker prediction.
Bar plot showing the AUPRC score for each biomarker on the 18-biomarker prediction by Prov-GigaPath and competing methods.
Extended Data Fig. 2. Comparison on LUAD 5-gene mutation prediction.
Bar plots showing AUROC and AUPRC scores for predicting each gene mutation on LUAD 5-gene mutation prediction. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms the best comparison approach, with one-sided Wilcoxon test.
Extended Data Fig. 3. Comparison on Pan-cancer 5-gene mutation prediction.
Bar plots showing AUROC and AUPRC scores for predicting each gene mutation on Pan-cancer 5-gene mutation prediction. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms the best comparison approach, with one-sided Wilcoxon test.
Extended Data Fig. 4. Comparison on LUAD 5-gene mutation prediction in TCGA.
Bar plots showing AUPRC scores for predicting each gene mutation on LUAD 5-gene mutation prediction in TCGA. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms the best comparison approach, with one-sided Wilcoxon test.
Extended Data Fig. 5. Comparison on mutation prediction on new colorectal patients.
Bar plots showing AUROC and AUPRC scores for predicting 5-gene mutation and TMB status on new patients from Providence. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms the best comparison approach, with one-sided Wilcoxon test.
Extended Data Fig. 6. Comparison between pretraining the same model using Prov-Path and TCGA.
a-b, Bar plots showing the AUROC (a) and AUPRC (b) on LUAD 5-gene mutation prediction in TCGA using models trained on Prov-Path and TCGA. Prov-GigaPath is GigaPath trained on Prov-Path. GigaPath-TCGA is GigaPath trained on TCGA. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms GigaPath-TCGA, with one-sided Wilcoxon test.
Extended Data Fig. 7. Comparison between GigaPath trained using Prov-Path and HIPT trained using Prov-Path on mutation prediction.
a-j, Bar plots showing the AUROC (a-e) and AUPRC (f-j) of mutation prediction tasks by Prov-GigaPath and HIPT-Prov-Path. HIPT-Prov-Path indicates HIPT pretrained on Prov-Path. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms HIPT-Prov-Path, with one-sided Wilcoxon test.
Extended Data Fig. 8. Comparison between GigaPath trained using Prov-Path and HIPT trained using Prov-Path on cancer subtyping.
a-f, Bar plots showing the AUROC (a,c,e) and BACC (b,d,f) of cancer subtyping tasks by Prov-GigaPath and HIPT-Prov-Path. HIPT-Prov-Path indicates HIPT pretrained on Prov-Path. The error bars show the standard error across n = 10 independent experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms HIPT-Prov-Path, with one-sided Wilcoxon test.
Extended Data Fig. 9. Alignment between pathology reports and images.
a-d, Bar plots showing the performance of f1 (a), Precision (b), AUROC (c) and AUPRC (d) using fine-tuned Prov-GigaPath to predict mutations in the zero-shot learning setting. The error bars show the standard error across n = 50 experiments and the bar centre shows the mean value. The listed p-value indicates the significance level that Prov-GigaPath outperforms the best comparison approach, with one-sided Wilcoxon test. e, Scatter plots comparing Prov-GigaPath and MI-Zero on cancer subtyping prediction and mutation prediction in terms of balanced accuracy (BACC).

References

    1. Campanella G, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 2019;25:1301–1309. doi: 10.1038/s41591-019-0508-1.
    2. Lu MY, et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 2021;5:555–570. doi: 10.1038/s41551-020-00682-w.
    3. Song AH, et al. Artificial intelligence for digital and computational pathology. Nat. Rev. Bioeng. 2023;1:930–949. doi: 10.1038/s44222-023-00096-8.
    4. Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (IMLS, 2018).
    5. Ding, J. et al. Longnet: scaling transformers to 1,000,000,000 tokens. Preprint at 10.48550/arXiv.2307.02486 (2023).