Med Image Anal. 2023 Oct;89:102918. doi: 10.1016/j.media.2023.102918. Epub 2023 Aug 2.

Segment anything model for medical image analysis: An experimental study

Maciej A Mazurowski et al. Med Image Anal. 2023 Oct.

Abstract

Training segmentation models for medical images continues to be challenging due to the limited availability of data annotations. Segment Anything Model (SAM) is a foundation model trained on over 1 billion annotations, predominantly for natural images, that is intended to segment user-defined objects of interest in an interactive manner. While the model's performance on natural images is impressive, medical image domains pose their own set of challenges. Here, we perform an extensive evaluation of SAM's ability to segment medical images on a collection of 19 medical imaging datasets from various modalities and anatomies. In our experiments, we generated point and box prompts for SAM using a standard method that simulates interactive segmentation. We report the following findings: (1) SAM's performance based on single prompts varies widely depending on the dataset and the task, from IoU=0.1135 for spine MRI to IoU=0.8650 for hip X-ray. (2) Segmentation performance appears to be better for well-circumscribed objects with less ambiguous prompts, such as organ segmentation in computed tomography, and poorer in various other scenarios, such as brain tumor segmentation. (3) SAM performs notably better with box prompts than with point prompts. (4) SAM outperforms similar methods RITM, SimpleClick, and FocalClick in almost all single-point prompt settings. (5) When multiple point prompts are provided iteratively, SAM's performance generally improves only slightly, while the other methods improve to a level that surpasses SAM's point-based performance. We also provide several illustrations of SAM's performance on all tested datasets, of iterative segmentation, and of SAM's behavior given prompt ambiguity. We conclude that SAM shows impressive zero-shot segmentation performance for certain medical imaging datasets, but moderate to poor performance for others. SAM has the potential to make a significant impact on automated segmentation in medical imaging, but appropriate care needs to be applied when using it. Code for evaluating SAM is publicly available at https://github.com/mazurowski-lab/segment-anything-medical-evaluation.
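
To make the evaluation setup concrete, the sketch below shows how SAM can be queried with a single point or box prompt derived from a ground-truth mask and scored with IoU. It assumes the public segment_anything package (SamPredictor API) and a downloaded ViT-B checkpoint; the prompt placement used here (mask centroid for the point, tight bounding box for the box) is an illustration, not necessarily the exact protocol of the paper.

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def iou(pred, gt):
    # Intersection over union of two binary masks.
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 0.0

def prompts_from_mask(gt):
    # Illustrative prompt placement: mask centroid as the point prompt and the
    # tight bounding box as the box prompt, both in (x, y) pixel coordinates.
    ys, xs = np.nonzero(gt)
    point = np.array([[xs.mean(), ys.mean()]])                # shape (1, 2)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])  # x1, y1, x2, y2
    return point, box

# Placeholders: substitute a real RGB-converted image slice and its annotation.
image = np.zeros((256, 256, 3), dtype=np.uint8)
gt_mask = np.zeros((256, 256), dtype=np.uint8)
gt_mask[80:160, 96:192] = 1

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
point, box = prompts_from_mask(gt_mask)

# Single point prompt: keep the most confident of SAM's multimask outputs.
masks, scores, _ = predictor.predict(point_coords=point,
                                     point_labels=np.array([1]),
                                     multimask_output=True)
print("point-prompt IoU:", iou(masks[np.argmax(scores)], gt_mask))

# Single box prompt: one unambiguous output is sufficient.
masks, _, _ = predictor.predict(box=box, multimask_output=False)
print("box-prompt IoU:", iou(masks[0], gt_mask))

A box prompt gives SAM both the location and the extent of the object, which is consistent with finding (3) above that box prompts outperform point prompts.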

Keywords: Deep learning; Foundation models; Segmentation.


Conflict of interest statement

Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1. Examples of the prompt(s) generated by each of the five modes. Green contours show the ground-truth masks, and blue stars and boxes indicate the prompts.
Fig. 2. Performance of SAM under the five modes of use. Left: performance of SAM across 28 segmentation tasks, ranked in descending order based on Mode 4; the oracle performance for each mode is indicated by an inverted triangle. Right: a summary comparison of all five modes across all tasks, presented as a box-and-whisker plot.
Fig. 3. Visualization of SAM’s segmentation results in two different modes. Each dataset is shown in two consecutive rows, with its name along the left side. For each dataset, we display three examples from left to right, corresponding to the 25th, 50th, and 75th percentiles of IoU across all images in that dataset. For each example, we visualize (top left) the raw image; (bottom left) a zoomed-in view of the area of interest; (top right) the segmentation result for Mode 2 (one point at each object region); and (bottom right) the segmentation result for Mode 4 (one box at each object region). The IoU is shown above each segmentation result. Examples for all datasets are shown in Appendix Figures 1–5.
Fig. 4. Comparison of SAM with three competing methods, RITM, SimpleClick, and FocalClick, under the 1-point prompt setting. Results are presented as the difference between SAM and each other method (ΔIoU) and ranked in descending order of the largest ΔIoU for each task.
Fig. 5. Comparison of SAM and other methods under an interactive prompt setting. Left: the average performance of SAM and the other methods across all tasks as the number of prompts changes. Right: the detailed performance of SAM on each task.
Fig. 6. Examples of SAM’s predictions under the interactive prompt setting. For each dataset, we display results for 1-point through 9-point prompts. Positive prompts are shown as green stars and negative prompts as red stars. (A sketch of one common click-placement rule follows the figure captions.)
Fig. 7. Visualizations of examples with prompt ambiguity; SAM’s first, second, and third most confident predictions are shown in order.
Fig. 8. Performance of SAM when prompts are placed randomly within certain regions.
Fig. 9. (Top) The relative size of the objects in each dataset. (Bottom) Object size vs. detection performance for Mode 2 and Mode 4 separately; a fitted regression curve is shown for each.
Fig. 10. Examples of the segment-everything mode. For each example, we sampled a different number of grid points per side: 2^5, 2^6, and 2^7.
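
As a concrete illustration of the interactive setting in Figs. 5 and 6, below is a hedged sketch of one common click-simulation rule from the interactive-segmentation literature: the next click is placed at the pixel deepest inside the dominant error region, labeled positive for missed foreground or negative for spurious foreground. This is an assumption about the general protocol, not necessarily the exact rule used in the paper.

import numpy as np
from scipy import ndimage

def next_click(pred, gt):
    # Return ((x, y), label) for the next simulated click, or None when the
    # prediction already matches the ground truth.
    pred, gt = pred.astype(bool), gt.astype(bool)
    false_neg = gt & ~pred   # missed foreground   -> candidate positive clicks
    false_pos = pred & ~gt   # spurious foreground -> candidate negative clicks
    error, label = (false_neg, 1) if false_neg.sum() >= false_pos.sum() else (false_pos, 0)
    if not error.any():
        return None
    # The maximum of the Euclidean distance transform is the pixel farthest
    # from the boundary of the error region, i.e. its most central point.
    dist = ndimage.distance_transform_edt(error)
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    return (int(x), int(y)), label

Each simulated click is appended to the running point_coords and point_labels arrays and the predictor is queried again, which is how 1-point through 9-point curves such as those in Fig. 5 are typically produced.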
