Int J Comput Assist Radiol Surg. 2025 Nov;20(11):2231-2239.
doi: 10.1007/s11548-025-03406-0. Epub 2025 May 22.

Evaluating the generalizability of video-based assessment of intraoperative surgical skill in capsulorhexis

Zhiwei Gong et al. Int J Comput Assist Radiol Surg. 2025 Nov.

Abstract

Purpose: Assessment of intraoperative surgical skill is necessary to train surgeons and certify them for practice. The generalizability of deep learning models for video-based assessment (VBA) of surgical skill has not yet been evaluated. In this work, we evaluated one unsupervised domain adaptation (UDA) and three semi-supervised (SSDA) methods for generalizability of models for VBA of surgical skill in capsulorhexis by training on one dataset and testing on another.

Methods: We used two datasets, D99 and Cataract-101 (publicly available), and two state-of-the-art models for capsulorhexis. The models include a convolutional neural network (CNN) to extract features from video images, followed by a long short-term memory (LSTM) network or a transformer. We augmented the CNN and the LSTM with attention modules. We estimated accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
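The four reported metrics can be illustrated with a minimal sketch. It assumes binary skill labels (1 for the positive class, 0 for the negative class) and a 0.5 decision threshold; neither the label encoding nor the threshold is stated in the abstract. The function names `binary_metrics` and `auc` are hypothetical and are not the authors' code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative values only, not data from the study.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
metrics = binary_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike the other three metrics, AUC is computed from the continuous scores and does not depend on the chosen threshold.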

Results: Maximum mean discrepancy (MMD) did not improve the generalizability of the CNN-LSTM model but slightly improved that of the CNN-transformer model. Among the SSDA methods, Group Distributionally Robust Supervised Learning improved generalizability in most cases.
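For readers unfamiliar with MMD, the quantity can be sketched as follows. This is a generic, biased estimate of squared MMD with a Gaussian (RBF) kernel, computed between feature vectors from a source and a target dataset. The kernel choice, the bandwidth parameter `gamma`, and the function names are assumptions made for illustration; the abstract does not say which kernel was used or at which layer the discrepancy was applied.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(x, y, gamma) for x in a for y in b)
        return total / (len(a) * len(b))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


# Illustrative feature vectors only, not data from the study.
source_feats = [[0.0, 0.0], [1.0, 0.0]]
shifted_feats = [[3.0, 3.0], [4.0, 3.0]]
same_domain = mmd_squared(source_feats, source_feats)
cross_domain = mmd_squared(source_feats, shifted_feats)
```

In a typical unsupervised domain adaptation setup, a term like this is added to the training loss so that features from the labeled source dataset and the unlabeled target dataset are pulled toward the same distribution.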

Conclusion: Model performance improved with the domain adaptation methods we evaluated, but it fell short of within-dataset performance. Our results provide benchmarks on a public dataset against which others can compare their methods.

Keywords: Cataract surgery; Domain adaptation; Surgical skill assessment; Transformer.

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.
