2025 Apr 23;15(1):14042.
doi: 10.1038/s41598-025-94568-z.

SegMatch: semi-supervised surgical instrument segmentation

Meng Wei et al. Sci Rep.

Abstract

Surgical instrument segmentation is recognised as a key enabler of advanced surgical assistance and improved computer-assisted interventions. In this work, we propose SegMatch, a semi-supervised learning method that reduces the need for expensive annotation of laparoscopic and robotic surgical images. SegMatch builds on FixMatch, a widely used semi-supervised classification pipeline combining consistency regularization and pseudo-labelling, and adapts it for segmentation. In our proposed SegMatch, the unlabelled images are first weakly augmented and fed to the segmentation model to generate pseudo-labels. In parallel, images are fed to a strong augmentation branch, and consistency between the branches is used as an unsupervised loss. To increase the relevance of our strong augmentations, we depart from using only handcrafted augmentations and introduce a trainable adversarial augmentation strategy. Our FixMatch adaptation for segmentation tasks further includes carefully considering the equivariance and invariance properties of the augmentation functions we rely on. For binary segmentation tasks, our algorithm was evaluated on the MICCAI Instrument Segmentation Challenge datasets, Robust-MIS 2019 and EndoVis 2017. For multi-class segmentation tasks, we relied on the recent CholecInstanceSeg dataset. Our results show that SegMatch outperforms fully-supervised approaches by incorporating unlabelled data, and surpasses a range of state-of-the-art semi-supervised models across different labelled-to-unlabelled data ratios.
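The core FixMatch-style mechanism described above (confident pseudo-labels from the weak branch supervising the strong branch) can be sketched per pixel as follows. This is a minimal illustration, not the paper's implementation: the function name, the threshold value `tau=0.95`, and the plain cross-entropy form are assumptions.

```python
import numpy as np

def segmatch_unsup_loss(weak_probs, strong_probs, tau=0.95):
    """Sketch of a FixMatch-style unsupervised loss adapted to segmentation.

    weak_probs, strong_probs: (H, W, C) per-pixel class probabilities from
    the weak and strong augmentation branches of the same unlabelled image.
    Pixels whose weak-branch confidence clears tau receive a hard
    pseudo-label; cross-entropy is applied to the strong branch there.
    """
    conf = weak_probs.max(axis=-1)        # per-pixel confidence
    pseudo = weak_probs.argmax(axis=-1)   # per-pixel hard pseudo-label
    mask = conf >= tau                    # only confident pixels contribute
    if not mask.any():
        return 0.0
    h, w, c = strong_probs.shape
    # probability the strong branch assigns to each pixel's pseudo-label
    picked = strong_probs.reshape(-1, c)[np.arange(h * w),
                                         pseudo.ravel()].reshape(h, w)
    return float(-(np.log(picked + 1e-8) * mask).sum() / mask.sum())
```

Averaging only over confident pixels means early in training, when few pixels clear the threshold, the unsupervised signal stays small rather than noisy.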


Figures

Fig. 1
Representative sample images from Robust-MIS 2019 of laparoscopic surgery (left) and state-of-the-art instrument segmentation results (right). True positive (yellow), true negative (black), false positive (purple), and false negative (red).
Fig. 2
SegMatch training process structure. The top row is the fully-supervised pathway which follows the traditional segmentation model training process. The two bottom rows form the unsupervised learning pathway, where one branch uses a weakly augmented image fed into the model to compute predictions, and the second branch obtains the model prediction via strong augmentation for the same image. The model parameters are shared across the two pathways. The hand-crafted photometric augmentation methods are used to initialize the strong augmented image, which is further perturbed by an adversarial attack (I-FGSM) for K iterations.
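The I-FGSM step in the caption above, where the strongly augmented image is perturbed for K iterations, can be sketched as follows. This is illustrative only: no network is involved, so the loss gradient is supplied as a callable, and the step size `eps / k` and clipping scheme are assumptions rather than the paper's exact settings.

```python
import numpy as np

def ifgsm(x, grad_fn, eps=0.03, k=3):
    """Iterative FGSM (I-FGSM) sketch for the adversarial augmentation step.

    x: image array with values in [0, 1].
    grad_fn(x): returns dLoss/dx at x (in SegMatch this would come from
    backpropagating the consistency loss through the model).
    The perturbed image stays within an eps-ball of the starting image.
    """
    x0 = x.copy()
    alpha = eps / k                              # per-iteration step size
    for _ in range(k):
        x = x + alpha * np.sign(grad_fn(x))      # ascend the loss
        x = np.clip(x, x0 - eps, x0 + eps)       # project into eps-ball
        x = np.clip(x, 0.0, 1.0)                 # keep a valid image
    return x
```

With k=1 this reduces to single-step FGSM, matching the relationship noted in the Fig. 7 caption.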
Fig. 3
Equivariance (left) and invariance (right) properties for an image augmented under different types of augmentations: spatial (left) or photometric (right).
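The distinction in Fig. 3 determines how each augmentation is handled in the consistency loss: a spatial transform applied to the image must also be applied to the pseudo-label (equivariance), while a photometric transform changes the image but must leave the target mask untouched (invariance). A toy sketch, with a horizontal flip and a brightness shift standing in for the two augmentation families (both choices are illustrative):

```python
import numpy as np

def hflip(a):
    """Spatial augmentation: horizontal flip (equivariant for segmentation)."""
    return a[:, ::-1]

def brighten(img, delta=0.1):
    """Photometric augmentation: brightness shift (invariant for segmentation)."""
    return np.clip(img + delta, 0.0, 1.0)

# Equivariance: flipping the image requires flipping the pseudo-label too,
# otherwise the consistency loss compares misaligned masks.
mask = np.array([[1, 0],
                 [1, 0]])
flipped_mask = hflip(mask)           # the target for the flipped image

# Invariance: brightening changes pixel values but not object locations,
# so the original mask remains the correct target.
img = np.full((2, 2), 0.5)
bright_img = brighten(img)           # image changed; mask stays `mask`
```

In practice this means spatial augmentations must be implemented as invertible (or replayable) transforms so they can be applied to the pseudo-labels, whereas photometric ones need no label-side counterpart.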
Algorithm 1
Workflow for weak augmentation branch
Algorithm 2
Workflow for strong augmentation branch
Fig. 4
Segmentation results on exemplar images from three different procedures in the testing set. Here, SegMatch, CCT, and WSSL were trained using the whole labelled training set of Robust-MIS 2019 as a labelled set, plus 17K additional unlabelled frames from the original videos. The fully supervised models (OR-UNet and ISINet) were trained using the whole labelled training set of Robust-MIS 2019 only. The first column shows the ground truth mask overlaid on the original image; the remaining columns show the segmentation results of SegMatch ablation models and state-of-the-art models. The three rows, from top to bottom, show testing image samples from a proctocolectomy procedure, a sigmoid resection procedure (unseen type), and a rectal resection procedure, respectively. The yellow stars highlight key areas for better visualization.
Fig. 5
Mean Dice score produced by varying the confidence threshold for pseudo-labels.
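The threshold swept in Fig. 5 trades pseudo-label quantity against quality: raising it keeps fewer but more reliable pixels. A one-line sketch of the quantity being varied (function name illustrative):

```python
import numpy as np

def retained_fraction(probs, tau):
    """Fraction of pixels whose max class probability clears the
    pseudo-label confidence threshold tau.

    probs: (H, W, C) per-pixel class probabilities from the weak branch.
    """
    return float((probs.max(axis=-1) >= tau).mean())
```

Plotting mean Dice against `tau`, as in Fig. 5, locates the threshold where the remaining pseudo-labels are confident enough to help without discarding too much of the unlabelled data.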
Fig. 6
Optimal formula image value enhances segmentation performance (as indicated by the peak in mean Dice score).
Fig. 7
Examples showcasing the impact of strong augmentation transform functions and adversarial augmentation on an original unlabelled image input to the model. The first column shows the original image overlaid with its segmentation mask output by the model, together with the strongly augmented image obtained via the initial strong augmentation and its output segmentation mask. Columns 2-6 show adversarial images produced by I-FGSM with varying values of formula image and K (which reduces to FGSM when formula image), used in place of the original strongly augmented images for model parameter updating. The upper rows in columns 2-6 display the adversarial images, while the bottom rows show the corresponding segmentation results produced by the model.
Fig. 8
Failure cases of SegMatch’s output. The first row shows the original image with the ground truth mask, while the second row overlays SegMatch predictions on the original image. Columns 1 to 4 illustrate the 1st, 2nd, 3rd, and 4th examples, respectively.
