Sci Rep. 2022 Jul 22;12(1):12575. doi: 10.1038/s41598-022-16923-8.

Limited generalizability of single deep neural network for surgical instrument segmentation in different surgical environments

Daichi Kitaguchi et al.

Abstract

Clarifying the generalizability of deep-learning-based surgical-instrument segmentation networks across diverse surgical environments is important for recognizing the risk of overfitting in surgical-device development. This study comprehensively evaluated the generalizability of a deep neural network for surgical-instrument segmentation using 5238 images randomly extracted from 128 intraoperative videos. The video dataset comprised 112 laparoscopic colorectal resection, 5 laparoscopic distal gastrectomy, 5 laparoscopic cholecystectomy, and 6 laparoscopic partial hepatectomy cases. Deep-learning-based surgical-instrument segmentation was performed on test sets with (1) the same conditions as the training set; (2) the same recognition target surgical instruments and surgery type but a different laparoscopic recording system; (3) the same laparoscopic recording system and surgery type but slightly different recognition target laparoscopic surgical forceps; and (4) the same laparoscopic recording system and recognition target surgical instruments but different surgery types. The mean average precision and mean intersection over union for test sets 1, 2, 3, and 4 were 0.941 and 0.887, 0.866 and 0.671, 0.772 and 0.676, and 0.588 and 0.395, respectively. Thus, recognition accuracy decreased even under slightly different conditions. These results reveal the limited generalizability of deep neural networks in the field of surgical artificial intelligence and caution against biased datasets and models in deep-learning-based development. Trial Registration Number: 2020-315; date of registration: October 5, 2020.
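The accuracy figures above are reported as mean intersection over union (mIoU): the per-instrument IoU between predicted and ground-truth segmentation masks, averaged over the recognition targets. As a reminder of how the metric is defined (a minimal sketch, not the authors' implementation), binary masks can be compared as follows:

```python
# Minimal sketch of intersection-over-union (IoU) for binary masks,
# the per-class metric that is averaged into the mIoU values reported above.
# Illustrative only; not the authors' evaluation code.

def iou(pred, gt):
    """IoU between two binary masks given as flat 0/1 lists of equal length."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return inter / union if union else 1.0  # both masks empty: define IoU = 1

def mean_iou(per_class_ious):
    """mIoU: average of per-class IoUs over the recognition targets (e.g. T1-T3)."""
    return sum(per_class_ious) / len(per_class_ious)

# Example: prediction overlaps 3 of 4 ground-truth pixels and adds
# 1 false-positive pixel, so IoU = 3 / 5 = 0.6.
pred = [1, 1, 1, 1, 0]
gt   = [0, 1, 1, 1, 1]
print(iou(pred, gt))  # 0.6
```

A drop in mIoU between test sets therefore reflects masks that overlap the ground truth less, whether from missed instrument pixels or false positives on unfamiliar instruments or backgrounds.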


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Representative images of recognition target surgical instruments in this study. (A) Surgical instruments contained in the training set (T1: harmonic shears; T2: endoscopic surgical electrocautery; T3: Aesculap AdTec atraumatic universal forceps). (B) Laparoscopic surgical forceps not contained in the training set (T4: Maryland; T5: Croce-Olmi; T6: needle holder).
Figure 2
Surgical-instrument recognition-accuracy results (AP average precision, IoU intersection over union, mAP mean average precision, mIoU mean intersection over union). (A) AP and IoU under the same condition as the training set (T1: harmonic shears; T2: endoscopic surgical electrocautery; T3: Aesculap AdTec atraumatic universal forceps). (B) mAP and mIoU for different types of laparoscopic recording systems. (C) AP and IoU for different types of laparoscopic surgical forceps (T3: Aesculap AdTec atraumatic universal forceps; T4: Maryland; T5: Croce-Olmi; T6: needle holder). (D) mAP and mIoU for different types of surgery (LCRR laparoscopic colorectal resection, LDG laparoscopic distal gastrectomy, LC laparoscopic cholecystectomy, LPH laparoscopic partial hepatectomy).
Figure 3
Representative images recorded by each laparoscopic recording system. (A) Endoeye laparoscope (Olympus Co., Ltd., Tokyo, Japan) and Visera Elite II system (Olympus Co., Ltd, Tokyo, Japan). (B) 1488 HD 3-Chip camera system (Stryker Corp., Kalamazoo, MI, USA). (C) Image 1 S camera system (Karl Storz SE & Co., KG, Tuttlingen, Germany).
Figure 4
Representative images of each type of surgery. (A) LCRR; (B) LDG; (C) LC; (D) LPH.
