Fast and scalable search of whole-slide images via self-supervised deep learning
- PMID: 36217022
- PMCID: PMC9792371
- DOI: 10.1038/s41551-022-00929-8
Abstract
The adoption of digital pathology has enabled the curation of large repositories of gigapixel whole-slide images (WSIs). Computationally identifying WSIs with similar morphologic features within large repositories without requiring supervised training can have significant applications. However, the retrieval speeds of algorithms for searching similar WSIs often scale with the repository size, which limits their clinical and research potential. Here we show that self-supervised deep learning can be leveraged to search for and retrieve WSIs at speeds that are independent of repository size. The algorithm, which we named SISH (for self-supervised image search for histology) and provide as an open-source package, requires only slide-level annotations for training, encodes WSIs into meaningful discrete latent representations and leverages a tree data structure for fast searching followed by an uncertainty-based ranking algorithm for WSI retrieval. We evaluated SISH on multiple tasks (including retrieval tasks based on tissue-patch queries) and on datasets spanning over 22,000 patient cases and 56 disease subtypes. SISH can also be used to aid the diagnosis of rare cancer types for which the number of available WSIs is often insufficient to train supervised deep-learning models.
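The search-then-rank idea described in the abstract can be illustrated with a minimal sketch. It assumes each tissue patch has already been encoded into a single integer code, and that each code is stored alongside its slide-level label. The abstract does not specify the tree here, so a sorted list with binary search stands in for it; this stand-in has lookup cost that grows logarithmically with the index size and does not reproduce the size-independent speed reported for SISH. The entropy-based ordering is likewise only one plausible reading of "uncertainty-based ranking". All names (`IntegerIndex`, `label_entropy`, `rank_by_uncertainty`) are hypothetical and are not taken from the SISH package.

```python
import bisect
import math
from collections import Counter


class IntegerIndex:
    """Sorted-key index; a stand-in for the tree structure named in the abstract."""

    def __init__(self):
        self._keys = []    # integer patch codes, kept sorted
        self._labels = []  # slide-level label aligned with each key

    def add(self, key, label):
        pos = bisect.bisect_left(self._keys, key)
        self._keys.insert(pos, key)
        self._labels.insert(pos, label)

    def nearest(self, key, k):
        """Return up to k (key, label) pairs closest to `key` in integer distance."""
        pos = bisect.bisect_left(self._keys, key)
        lo, hi = pos - 1, pos  # walk outwards from the insertion point
        out = []
        while len(out) < k and (lo >= 0 or hi < len(self._keys)):
            take_lo = hi >= len(self._keys) or (
                lo >= 0 and key - self._keys[lo] <= self._keys[hi] - key
            )
            if take_lo:
                out.append((self._keys[lo], self._labels[lo]))
                lo -= 1
            else:
                out.append((self._keys[hi], self._labels[hi]))
                hi += 1
        return out


def label_entropy(labels):
    """Shannon entropy (bits) of the label distribution; 0 means full agreement."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def rank_by_uncertainty(index, query_keys, k=3):
    """Order query patches so the least ambiguous neighbourhoods come first."""
    scored = []
    for q in query_keys:
        hits = index.nearest(q, k)
        scored.append((label_entropy([lab for _, lab in hits]), q, hits))
    scored.sort(key=lambda item: item[0])
    return scored


if __name__ == "__main__":
    index = IntegerIndex()
    # Toy codes and labels, invented purely for illustration.
    for code, label in [(10, "LUAD"), (12, "LUAD"), (13, "LUAD"),
                        (50, "LUSC"), (52, "LUAD"), (55, "LUSC")]:
        index.add(code, label)
    for entropy, query, hits in rank_by_uncertainty(index, [51, 11]):
        print(query, round(entropy, 3), hits)
```

In this toy setup, a query patch whose neighbours all share one label is ranked ahead of a patch whose neighbours disagree, so the more consistent evidence is considered first when retrieving slides.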
© 2022. The Author(s).
Conflict of interest statement
F.M. receives research support from Leidos Biomedical Research, Inc. for projects unrelated to this study. The authors declare no other competing interests.
Similar articles
- SAMPLER: unsupervised representations for rapid analysis of whole slide tissue images. EBioMedicine. 2024 Jan;99:104908. doi: 10.1016/j.ebiom.2023.104908. Epub 2023 Dec 14. PMID: 38101298 Free PMC article.
- Weakly Supervised Deep Learning for Whole Slide Lung Cancer Image Analysis. IEEE Trans Cybern. 2020 Sep;50(9):3950-3962. doi: 10.1109/TCYB.2019.2935141. Epub 2019 Sep 2. PMID: 31484154
- RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval. Med Image Anal. 2023 Jan;83:102645. doi: 10.1016/j.media.2022.102645. Epub 2022 Oct 1. PMID: 36270093
- Deep learning for colon cancer histopathological images analysis. Comput Biol Med. 2021 Sep;136:104730. doi: 10.1016/j.compbiomed.2021.104730. Epub 2021 Aug 4. PMID: 34375901 Review.
- Applications of discriminative and deep learning feature extraction methods for whole slide image analysis: A survey. J Pathol Inform. 2023 Sep 14;14:100335. doi: 10.1016/j.jpi.2023.100335. eCollection 2023. PMID: 37928897 Free PMC article. Review.
Cited by
- Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform. 2024 May 23;25(4):bbae316. doi: 10.1093/bib/bbae316. PMID: 38960406 Free PMC article.
- Self-supervised learning reveals clinically relevant histomorphological patterns for therapeutic strategies in colon cancer. Nat Commun. 2025 Mar 8;16(1):2328. doi: 10.1038/s41467-025-57541-y. PMID: 40057490 Free PMC article.
- Automated Detection of Kaposi Sarcoma-Associated Herpesvirus-Infected Cells in Immunohistochemical Images of Skin Biopsies. JCO Glob Oncol. 2025 Apr;11:e2400536. doi: 10.1200/GO-24-00536. Epub 2025 Apr 16. PMID: 40239145
- Novel research and future prospects of artificial intelligence in cancer diagnosis and treatment. J Hematol Oncol. 2023 Nov 27;16(1):114. doi: 10.1186/s13045-023-01514-5. PMID: 38012673 Free PMC article. Review.
- Attention2Minority: A salient instance inference-based multiple instance learning for classifying small lesions in whole slide images. Comput Biol Med. 2023 Dec;167:107607. doi: 10.1016/j.compbiomed.2023.107607. Epub 2023 Oct 20. PMID: 37890421 Free PMC article.