Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation

Peilun Shi¹, Jianing Qiu¹,², Sai Mu Dalike Abaxi¹, Hao Wei¹, Frank P-W Lo³, Wu Yuan¹

Affiliations

¹ Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China.
² Department of Computing, Imperial College London, London SW7 2AZ, UK.
³ Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK.

PMID: 37296799
PMCID: PMC10252742
DOI: 10.3390/diagnostics13111947

Peilun Shi et al. Diagnostics (Basel). 2023 Jun 2;13(11):1947. doi: 10.3390/diagnostics13111947.

Abstract

Medical image analysis plays an important role in clinical diagnosis. In this paper, we examine the recent Segment Anything Model (SAM) on medical images and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks. The benchmarks cover various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications, including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM presents remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted for out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains. For certain structured targets, e.g., blood vessels, the zero-shot segmentation of SAM failed completely. In contrast, simple fine-tuning of SAM with a small amount of data could lead to remarkable improvement in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM to achieve accurate medical image segmentation for precision diagnostics. Our study indicates the versatility of generalist vision foundation models in medical imaging, and their great potential to achieve the desired performance through fine-tuning and eventually to address the challenges associated with accessing large and diverse medical datasets in support of clinical diagnostics.
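The abstract does not say which metrics underlie the quantitative results. As an illustration only, the sketch below computes two overlap measures commonly used for segmentation, the Dice coefficient and intersection over union (IoU), between a predicted binary mask and a ground-truth mask. The function name `overlap_scores` and the toy masks are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask, nonzero = foreground


def overlap_scores(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (dice, iou) for two binary masks of identical shape."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    inter = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t      # pixel is foreground in both masks
            pred_area += p
            truth_area += t

    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks empty: scored as perfect agreement (a convention, assumed here).
        return 1.0, 1.0
    dice = 2 * inter / (pred_area + truth_area)
    iou = inter / union
    return dice, iou


# Toy example (made up): a thin, vessel-like target versus a blob-shaped prediction.
truth = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
pred = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
dice, iou = overlap_scores(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

In this toy case the prediction covers half of the thin target but also a large amount of background, so both scores are low. This is the kind of mismatch that thin structures such as vessels can produce.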

Keywords: Segment Anything Model (SAM); deep learning; foundation models; large AI models; medical image segmentation; zero-shot segmentation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Successful segmentation examples of SAM. Eight distinct modalities labeled with (A–H) are included, corresponding to dermoscope, fundus, CT, MRI, RGB endoscope, X-ray, endoscopic OCT, and ophthalmic OCT. Each set of images comprises four images, containing two pairs of SAM segmentation versus corresponding ground truth (GT).

**Figure 2**
Failure segmentation examples of SAM. Eight distinct modalities labeled with (A–H) correspond to dermoscope, fundus, CT, MRI, RGB endoscope, X-ray, endoscopic OCT, and ophthalmic OCT. Each set of images consists of two paired SAM segmentation and ground truth (GT).

**Figure 3**
A failure sample of SAM on segmenting retinal vessels. The first row, from left to right, shows the initial input image, the ground truth mask, and the input image superimposed with the ground truth mask. The second row, from left to right, shows three SAM-segmented images with scores of 1.007, 0.993, and 0.673, respectively.

**Figure 4**
Segmentation samples of SAM fine-tuned on retinal vessels. Each row, from left to right, shows the initial input image, the ground truth mask, and the prediction of fine-tuned SAM. The rows, from top to bottom, show retinal images from four different datasets.

Cited by

Improved Generalizability in Medical Computer Vision: Hyperbolic Deep Learning in Multi-Modality Neuroimaging.
Ayubcha C, Sajed S, Omara C, Veldman AB, Singh SB, Lokesha YU, Liu A, Aziz-Sultan MA, Smith TR, Beam A. J Imaging. 2024 Dec 12;10(12):319. doi: 10.3390/jimaging10120319. PMID: 39728216.
MRI radiomics-based decision support tool for a personalized classification of cervical disc degeneration: a two-center study.
Xie J, Yang Y, Jiang Z, Zhang K, Zhang X, Lin Y, Shen Y, Jia X, Liu H, Yang S, Jiang Y, Ma L. Front Physiol. 2024 Jan 3;14:1281506. doi: 10.3389/fphys.2023.1281506. eCollection 2023. PMID: 38235385.
Technical note: Generalizable and promptable artificial intelligence model to augment clinical delineation in radiation oncology.
Zhang L, Liu Z, Zhang L, Wu Z, Yu X, Holmes J, Feng H, Dai H, Li X, Li Q, Wong WW, Vora SA, Zhu D, Liu T, Liu W. Med Phys. 2024 Mar;51(3):2187-2199. doi: 10.1002/mp.16965. Epub 2024 Feb 6. PMID: 38319676.
Enhancing Meibography Image Analysis Through Artificial Intelligence-Driven Quantification and Standardization for Dry Eye Research.
Yeh CH, Graham AD, Yu SX, Lin MC. Transl Vis Sci Technol. 2024 Jun 3;13(6):16. doi: 10.1167/tvst.13.6.16. PMID: 38904611.
Enhancing Microdroplet Image Analysis with Deep Learning.
Gelado SH, Quilodrán-Casas C, Chagot L. Micromachines (Basel). 2023 Oct 22;14(10):1964. doi: 10.3390/mi14101964. PMID: 37893401.
