UltraSam: a foundation model for ultrasound using large open-access segmentation datasets
- PMID: 40932576
- DOI: 10.1007/s11548-025-03517-8
Abstract
Purpose: Automated ultrasound (US) image analysis remains a longstanding challenge due to anatomical complexity and the scarcity of annotated data. Although large-scale pretraining has improved data efficiency in many visual domains, its impact in US is limited by a pronounced domain shift from other imaging modalities and high variability across clinical applications, such as chest, ovarian, and endoscopic imaging. To address this, we propose UltraSam, a SAM-style model trained on a heterogeneous collection of publicly available segmentation datasets, originally developed in isolation. UltraSam is trained under the prompt-conditioned segmentation paradigm, which eliminates the need for unified labels and enables generalization to a broad range of downstream tasks.
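To make the prompt-conditioned paradigm concrete: rather than predicting a fixed label set, the model receives a spatial prompt (e.g., a point) that selects the target structure, so heterogeneous datasets can be trained jointly without unifying their labels. Below is a minimal PyTorch sketch of one such training step; `image_encoder`, `prompt_encoder`, and `mask_decoder` are hypothetical stand-ins for SAM-style components, not the authors' implementation.

```python
# Hedged sketch of SAM-style prompt-conditioned training; the three
# encoder/decoder modules are assumed stand-ins, not UltraSam's code.
import torch
import torch.nn.functional as F

def sample_point_prompt(mask: torch.Tensor) -> torch.Tensor:
    """Sample one foreground pixel (y, x) from a binary mask as a point prompt."""
    ys, xs = torch.nonzero(mask, as_tuple=True)  # assumes a non-empty mask
    i = torch.randint(len(ys), (1,))
    return torch.stack([ys[i], xs[i]], dim=-1).float()  # shape (1, 2)

def training_step(image_encoder, prompt_encoder, mask_decoder,
                  image: torch.Tensor, gt_mask: torch.Tensor) -> torch.Tensor:
    """One step: the prompt, not a class label, selects the segmentation target."""
    img_emb = image_encoder(image.unsqueeze(0))        # (1, C, H', W') features
    point = sample_point_prompt(gt_mask)               # (1, 2) point prompt
    prompt_emb = prompt_encoder(point.unsqueeze(0))    # (1, N, C) prompt tokens
    pred_logits = mask_decoder(img_emb, prompt_emb)    # (1, 1, H, W) mask logits
    target = gt_mask.float().unsqueeze(0).unsqueeze(0)
    return F.binary_cross_entropy_with_logits(pred_logits, target)
```

Because the supervision signal is "segment what the prompt points at," datasets annotated for different organs can share one model without any label harmonization.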
Methods: We compile US-43d, a large-scale collection of 43 open-access US datasets comprising over 282,000 images with segmentation masks covering 58 anatomical structures. We explore adaptation and fine-tuning strategies for SAM and systematically evaluate transferability across downstream tasks, comparing against state-of-the-art pretraining methods. We further propose prompted classification, a new use case where object-specific prompts and image features are jointly decoded to improve classification performance.
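One plausible realization of the prompted-classification idea described above is a small decoder in which an object-specific prompt token cross-attends to image features before a linear classification head. The sketch below is an assumption based on that description; the module, feature dimension, and class count are illustrative and not taken from the paper.

```python
# Hedged sketch of "prompted classification": a prompt token queries the
# image features, so the logits are conditioned on the prompted object.
# PromptedClassifier and all dimensions are hypothetical.
import torch
import torch.nn as nn

class PromptedClassifier(nn.Module):
    def __init__(self, dim: int = 256, num_classes: int = 4, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, prompt_emb: torch.Tensor, img_emb: torch.Tensor) -> torch.Tensor:
        """prompt_emb: (B, 1, C) prompt token; img_emb: (B, HW, C) image features."""
        # Joint decoding: the prompt attends to the image, then is classified.
        tok, _ = self.cross_attn(prompt_emb, img_emb, img_emb)
        return self.head(tok.squeeze(1))  # (B, num_classes) logits

# Example with SAM-like 256-d features on a 64x64 feature map.
clf = PromptedClassifier()
logits = clf(torch.randn(2, 1, 256), torch.randn(2, 64 * 64, 256))
```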
Results: In experiments on three diverse public US datasets, UltraSam outperforms existing SAM variants on prompt-based segmentation and surpasses self-supervised US foundation models on downstream (prompted) classification and instance segmentation tasks.
Conclusion: UltraSam demonstrates that SAM-style training on diverse, sparsely annotated US data enables effective generalization across tasks. By unlocking the value of fragmented public datasets, our approach lays the foundation for scalable, real-world US representation learning. We release our code and pretrained models at https://github.com/CAMMA-public/UltraSam and invite the community to further this effort by continuing to contribute high-quality datasets.
Keywords: Foundation models; Large-scale dataset; SAM; Ultrasound.
© 2025. CARS.
Conflict of interest statement
Declarations. Conflict of interest: The authors declare that they have no conflict of interest. Consent to Participate: No informed consent was required as the study did not involve human or animal participants.