Sci Data. 2025 May 20;12(1):825.
doi: 10.1038/s41597-025-05163-w.

CholecInstanceSeg: A Tool Instance Segmentation Dataset for Laparoscopic Surgery

Oluwatosin Alabi et al. Sci Data.

Abstract

In laparoscopic and robotic surgery, precise tool instance segmentation is an essential technology for advanced computer-assisted interventions. Although publicly available procedures of routine surgeries exist, they often lack comprehensive annotations for tool instance segmentation. Additionally, the majority of standard datasets for tool segmentation are derived from porcine (pig) surgeries. To address this gap, we introduce CholecInstanceSeg, the largest open-access tool instance segmentation dataset to date. Derived from the existing CholecT50 and Cholec80 datasets, CholecInstanceSeg provides novel annotations for laparoscopic cholecystectomy procedures in patients. Our dataset comprises 41.9k annotated frames extracted from 85 clinical procedures and 64.4k tool instances, each labelled with semantic masks and instance IDs. To ensure the reliability of our annotations, we perform extensive quality control, compute label agreement statistics, and benchmark the segmentation results with various instance segmentation baselines. CholecInstanceSeg aims to advance the field by offering a comprehensive and high-quality open-access dataset for the development and evaluation of tool instance segmentation algorithms.

Conflict of interest statement

Competing interests: TV is a co-founder and shareholder of Hypervision Surgical Ltd, London, UK. The authors declare that they have no other conflict of interest.

Figures

Fig. 1
CholecInstanceSeg contains frames extracted from CholecT50, CholecSeg8k, and Cholec80. These three datasets are sub-datasets of the CAMMA in-house dataset called Cholec120. There are also some shared image sequences (referred to as seqs in the figure for brevity) between CholecT50 and CholecSeg8k.
Fig. 2
CholecInstanceSeg is composed of frames from four dataset partitions: Instance-CholecT50-sparse, Instance-CholecT50-full, Instance-CholecSeg8k, and Instance-Cholec80-sparse. The diagram illustrates the number of image sequences (denoted seqs in the figure for brevity) and extraction rates for each partition. Note that 10 sequences are shared between Instance-CholecT50-full and Instance-CholecSeg8k.
Fig. 3
Semi-automatic annotation pipeline. The process begins with initial annotations using Instance-CholecT50-sparse and Instance-CholecSeg8k, followed by model training. The trained model generates labels, which are then reviewed and corrected by human annotators. The corrected annotations augment the training dataset, iterating through the cycle until the final Instance-CholecT50-full dataset is produced.
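
The cycle described in this caption can be sketched as a simple loop. This is only an illustration of the train, predict, review, and augment iteration, not the authors' implementation: the function name `semi_automatic_annotation` and the `train`, `predict`, and `review` callables are hypothetical stand-ins, and the toy versions at the bottom exist only to make the sketch runnable.

```python
def semi_automatic_annotation(seed_labels, unlabelled_batches, train, predict, review):
    """Iterate train -> predict -> human review -> augment (hypothetical sketch).

    seed_labels: dict mapping frame id -> initial annotation
    unlabelled_batches: iterable of lists of frame ids still to be annotated
    train, predict, review: caller-supplied callables (assumptions, see below)
    """
    labelled = dict(seed_labels)      # start from the initial annotations
    model = train(labelled)           # first model trained on the seed set
    for batch in unlabelled_batches:
        # The current model proposes labels for frames it has not seen.
        proposals = {frame: predict(model, frame) for frame in batch}
        # Human annotators review and correct every proposal.
        corrected = {frame: review(frame, p) for frame, p in proposals.items()}
        # Corrected labels augment the training set before retraining.
        labelled.update(corrected)
        model = train(labelled)
    return labelled, model


# Toy stand-ins so the loop can be executed end to end.
def toy_train(labelled):
    return {"n_training_frames": len(labelled)}

def toy_predict(model, frame):
    return f"proposal-for-{frame}"

def toy_review(frame, proposal):
    return proposal.replace("proposal", "corrected")

labels, final_model = semi_automatic_annotation(
    seed_labels={"frame_000": "manual"},
    unlabelled_batches=[["frame_001", "frame_002"], ["frame_003"]],
    train=toy_train,
    predict=toy_predict,
    review=toy_review,
)
print(sorted(labels), final_model)
```

Passing the three stages in as callables keeps the loop independent of any particular segmentation model or review tool, neither of which is specified in the caption.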
Fig. 4
(a) Dataset directory structure for CholecInstanceSeg. Each split (e.g. train) contains subdirectories for each sequence, with further subdirectories for annotations (ann_dir). (b) Sample JSON annotation file showing the structure of the annotation data, including class labels (“label”), polygon information (“points”) and instance IDs (“group_id”).
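
As a rough illustration of how such an annotation file might be read, the sketch below groups polygons by instance ID. Only the keys "label", "points", and "group_id" come from the caption; the top-level "shapes" list, the sample values, and the helper names `polygon_area` and `group_instances` are assumptions made for the example.

```python
import json
from collections import defaultdict

# Hypothetical annotation content. Only "label", "points" and "group_id"
# are named in the caption; the top-level "shapes" list is an assumption.
SAMPLE = """
{
  "shapes": [
    {"label": "grasper", "group_id": 1, "points": [[0, 0], [4, 0], [4, 3], [0, 3]]},
    {"label": "grasper", "group_id": 1, "points": [[10, 10], [12, 10], [12, 12]]},
    {"label": "hook", "group_id": 2, "points": [[5, 5], [9, 5], [9, 9], [5, 9]]}
  ]
}
"""

def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def group_instances(annotation):
    """Collect all polygons that share a group_id into one tool instance."""
    instances = defaultdict(lambda: {"label": None, "polygons": []})
    for shape in annotation["shapes"]:
        instance = instances[shape["group_id"]]
        instance["label"] = shape["label"]
        instance["polygons"].append(shape["points"])
    return dict(instances)

instances = group_instances(json.loads(SAMPLE))
for group_id, instance in sorted(instances.items()):
    area = sum(polygon_area(p) for p in instance["polygons"])
    print(group_id, instance["label"], len(instance["polygons"]), area)
```

Grouping by "group_id" rather than by "label" matters when one tool is split into several polygons (for example by occlusion) or when two tools of the same class appear in one frame.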
Fig. 5
(a) Distribution of each tool class across the dataset. (b) Distribution of the number of tools per frame.
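
The two distributions in this figure could be tallied along the following lines. This is a minimal sketch on made-up records, not the dataset's actual statistics: the `(frame, label, group_id)` tuples and their values are hypothetical.

```python
from collections import Counter

# Hypothetical (frame id, class label, instance id) records, one per tool instance.
records = [
    ("f1", "grasper", 1),
    ("f1", "hook", 2),
    ("f2", "grasper", 1),
    ("f3", "grasper", 1),
    ("f3", "grasper", 2),
]

# (a) Number of instances per tool class.
instances_per_class = Counter(label for _, label, _ in records)

# (b) Number of frames containing k tool instances.
tools_in_frame = Counter(frame for frame, _, _ in records)
frames_per_tool_count = Counter(tools_in_frame.values())

print(dict(instances_per_class))
print(dict(frames_per_tool_count))
```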
Fig. 6
Distribution of tool instances across sequences and partitions in CholecInstanceSeg.
Fig. 7
Recommended Dataset Splits for CholecInstanceSeg. The diagram shows the distribution of sequences across the training, validation, and testing datasets, highlighting the specific partitions (Instance-CholecT50-sparse, Instance-CholecSeg8k, Instance-CholecT50-full, Instance-Cholec80-sparse) and their corresponding sequences.
