A whole-slide foundation model for digital pathology from real-world data
- PMID: 38778098
- PMCID: PMC11153137
- DOI: 10.1038/s41586-024-07441-w
Abstract
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles1-3. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context4. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet5 method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data6. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology7,8 by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
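The two-stage design described above (a tile encoder for 256 × 256 image tiles followed by a LongNet-style slide encoder that aggregates tens of thousands of tile embeddings with dilated attention) can be illustrated with a short sketch. The code below is not the released Prov-GigaPath implementation: the module names, embedding dimension, segment length and the simplified single-rate dilated attention are illustrative assumptions, intended only to show how sparse attention keeps whole-slide context tractable.

```python
# Minimal sketch (assumed, not the authors' code) of a two-stage whole-slide model:
# a tile encoder embeds each 256 x 256 tile, and a slide encoder with a
# LongNet-style dilated attention aggregates the long tile sequence.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TileEncoder(nn.Module):
    """Stand-in tile encoder: maps a 3 x 256 x 256 tile to a d-dim embedding."""

    def __init__(self, dim: int = 384):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 7, stride=4, padding=3), nn.GELU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:  # (N, 3, 256, 256)
        return self.backbone(tiles)                           # (N, dim)


class DilatedSelfAttention(nn.Module):
    """Simplified dilated attention: tokens attend only to every r-th token
    within fixed-length segments, so cost grows roughly linearly with length."""

    def __init__(self, dim: int, heads: int = 6, segment: int = 512, dilation: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.segment, self.dilation = segment, dilation

    def forward(self, x: torch.Tensor) -> torch.Tensor:       # (B, L, dim)
        B, L, D = x.shape
        pad = (-L) % self.segment
        x_pad = F.pad(x, (0, 0, 0, pad))                       # pad L to a segment multiple
        segs = x_pad.view(B, -1, self.segment, D)              # (B, S, segment, D)
        out = torch.zeros_like(segs)
        # Each dilation offset forms its own sparse attention group
        # (padding tokens are included in the last segment for simplicity).
        for off in range(self.dilation):
            sub = segs[:, :, off::self.dilation, :]            # (B, S, segment // r, D)
            b, s, l, d = sub.shape
            attn_out, _ = self.attn(*([sub.reshape(b * s, l, d)] * 3))
            out[:, :, off::self.dilation, :] = attn_out.view(b, s, l, d)
        return out.view(B, -1, D)[:, :L]


class SlideEncoder(nn.Module):
    """Aggregates tile embeddings into a single slide-level representation."""

    def __init__(self, dim: int = 384, depth: int = 2):
        super().__init__()
        self.layers = nn.ModuleList([DilatedSelfAttention(dim) for _ in range(depth)])
        self.norm = nn.LayerNorm(dim)

    def forward(self, tile_emb: torch.Tensor) -> torch.Tensor:  # (B, L, dim)
        x = tile_emb
        for layer in self.layers:
            x = x + layer(x)                                     # residual connection
        return self.norm(x).mean(dim=1)                          # (B, dim) slide embedding


if __name__ == "__main__":
    tile_encoder, slide_encoder = TileEncoder(), SlideEncoder()
    tiles = torch.randn(512, 3, 256, 256)          # a small synthetic "slide" of 512 tiles
    with torch.no_grad():
        tile_emb = torch.cat([tile_encoder(chunk) for chunk in tiles.split(64)])
        slide_embedding = slide_encoder(tile_emb.unsqueeze(0))
    print(slide_embedding.shape)                   # torch.Size([1, 384])
```

Restricting attention to every r-th token within fixed-length segments avoids the quadratic cost of full self-attention over the whole tile sequence, which is what makes ultra-large-context modelling of gigapixel slides feasible; the actual LongNet layer combines several segment lengths and dilation rates rather than the single rate used here.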
© 2024. The Author(s).
Conflict of interest statement
C.B. is a member of the scientific advisory board and owns stock in PrimeVax and BioAI; is on the scientific board of Lunaphore and SironaDx; has a consultant or advisory relationship with Sanofi, Agilent, Roche and Incendia; contributes to institutional research for Illumina, and is an inventor on US patent applications US20180322632A1 (
References
- Song, A. H. et al. Artificial intelligence for digital and computational pathology. Nat. Rev. Bioeng. 1, 930–949 (2023). https://doi.org/10.1038/s44222-023-00096-8
- Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (IMLS, 2018).
- Ding, J. et al. LongNet: scaling transformers to 1,000,000,000 tokens. Preprint at https://doi.org/10.48550/arXiv.2307.02486 (2023).
