AI-assisted anatomical structure recognition and segmentation via mamba-transformer architecture in abdominal ultrasound images
- PMID: 40771938
- PMCID: PMC12325247
- DOI: 10.3389/frai.2025.1618607
Abstract
Background: Abdominal ultrasonography is a primary diagnostic tool for evaluating medical conditions within the abdominal cavity. Accurately determining the relative locations of intra-abdominal organs and lesions from anatomical features in ultrasound images is essential in diagnostic sonography, and recognizing and extracting anatomical landmarks facilitates lesion evaluation and enhances diagnostic interpretation. Recent artificial intelligence (AI) segmentation methods based on deep neural networks (DNNs) and transformers struggle to balance the preservation of feature dependencies against computational efficiency, which limits their clinical applicability.
Methods: The anatomical structure recognition framework, MaskHybrid, was developed on a private dataset comprising 34,711 abdominal ultrasound images from 2,063 patients at CSMUH. The dataset covered abdominal organs and vascular structures (hepatic vein, inferior vena cava, portal vein, gallbladder, kidney, pancreas, spleen) and liver lesions (hepatic cyst, tumor). MaskHybrid adopts a mamba-transformer hybrid architecture consisting of an evolved backbone network, an encoder, and a corresponding decoder that effectively captures long-range spatial dependencies and contextual information, improving image segmentation while mitigating the computational burden of the transformer-based attention mechanism.
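The abstract describes the hybrid encoder only at a high level. The sketch below is a minimal PyTorch illustration of the general idea, a linear-time state-space (Mamba-style) token mixer followed by self-attention and an MLP, and is not the authors' MaskHybrid implementation: the layer names, dimensions, and the simplified recurrent scan are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleSSMBlock(nn.Module):
    """Toy selective state-space (Mamba-like) mixer over the token axis."""
    def __init__(self, dim, state_dim=16):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim * 2)               # content + gate paths
        self.A = nn.Parameter(torch.randn(dim, state_dim) * 0.01)
        self.B = nn.Linear(dim, state_dim)                   # input-dependent projections
        self.C = nn.Linear(dim, state_dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):                                    # x: (batch, tokens, dim)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        bsz, seq_len, dim = u.shape
        state = u.new_zeros(bsz, dim, self.A.shape[1])
        decay = torch.sigmoid(self.A)                        # per-channel decay in (0, 1)
        outputs = []
        for t in range(seq_len):                             # linear-time recurrent scan
            state = state * decay + u[:, t, :, None] * self.B(u[:, t]).unsqueeze(1)
            outputs.append((state * self.C(u[:, t]).unsqueeze(1)).sum(dim=-1))
        y = torch.stack(outputs, dim=1) * F.silu(gate)       # gated output
        return self.out_proj(y)

class HybridBlock(nn.Module):
    """One encoder stage: SSM mixer for global context, then self-attention + MLP."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.ssm = SimpleSSMBlock(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(),
                                 nn.Linear(dim * 4, dim))

    def forward(self, x):
        x = x + self.ssm(self.norm1(x))                      # linear-time global mixing
        h = self.norm2(x)
        attn_out, _ = self.attn(h, h, h)                     # attention-based refinement
        x = x + attn_out
        return x + self.mlp(self.norm3(x))

tokens = torch.randn(1, 196, 64)       # e.g., 14x14 patch tokens from a backbone
print(HybridBlock(64)(tokens).shape)   # -> torch.Size([1, 196, 64])
```

The intent of such a design, as the abstract notes, is that the recurrent state-space path scales linearly with token count while the attention path retains pairwise contextual refinement, trading off accuracy against compute.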
Results: Experiments on the retrospective dataset achieved a mean average precision (mAP) of 74.13% for anatomical landmark segmentation in abdominal ultrasound images. The proposed framework outperformed baselines across most organ and lesion types and effectively segmented challenging anatomical structures. Moreover, MaskHybrid exhibited a significantly shorter inference time (0.120 ± 0.013 s), approximately 2.5 times faster than large AI models of comparable size. By combining Mamba and transformer architectures, this hybrid design is well suited to the timely segmentation of complex anatomical structures in abdominal ultrasonography, where accuracy and efficiency are critical in clinical practice.
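As a quick sanity check on the reported efficiency, the snippet below converts the quoted mean per-image latency (0.120 s) into approximate throughput; this is simple arithmetic on the numbers stated above, not a measurement from the paper.

```python
# Approximate throughput implied by the reported mean latency of 0.120 s per image.
mean_latency_s = 0.120
fps = 1.0 / mean_latency_s
print(f"~{fps:.1f} images/s")   # ~8.3 images/s, consistent with near real-time sonography
```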
Conclusion: The proposed mamba-transformer hybrid recognition framework simultaneously detects and segments multiple abdominal organs and lesions in ultrasound images, achieving superior segmentation accuracy, visualization quality, and inference efficiency. It thereby supports improved medical image interpretation and near real-time diagnostic sonography that meets clinical needs.
Keywords: abdominal ultrasound; anatomical structure; artificial intelligence; deep learning; image segmentation; sonography; state space models; transformer.
Copyright © 2025 Chang, Wu, Tsai, Tseng and Wang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.