Transl Vis Sci Technol. 2023 Mar 1;12(3):23.

doi: 10.1167/tvst.12.3.23.

PhacoTrainer: Deep Learning for Cataract Surgical Videos to Track Surgical Tools

Hsu-Hang Yeh¹, Anjal M Jain², Olivia Fox³, Kostya Sebov¹, Sophia Y Wang¹,²

Affiliations

¹ Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA.
² Department of Ophthalmology, Byers Eye Institute, Stanford University, Palo Alto, CA, USA.
³ Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA.

PMID: 36947046
PMCID: PMC10050900
DOI: 10.1167/tvst.12.3.23

Hsu-Hang Yeh et al. Transl Vis Sci Technol. 2023.

Abstract

Purpose: The purpose of this study was to build a deep-learning model that automatically analyzes cataract surgical videos for the locations of surgical landmarks, and to derive skill-related motion metrics.

Methods: The locations of the pupil, limbus, and 8 classes of surgical instruments were identified by a 2-step algorithm: (1) mask segmentation and (2) landmark identification from the masks. To perform mask segmentation, we trained the YOLACT model on 1156 frames sampled from 268 videos and on the public Cataract Dataset for Image Segmentation (CaDIS). Landmark identification was performed by fitting ellipses or lines to the contours of the masks and deriving locations of interest, including surgical tooltips and the pupil center. Landmark identification was evaluated by the distance between the predicted and true positions in 5853 frames of 10 phacoemulsification video clips. We derived the total path length, maximal speed, and covered area from the tip positions and examined their correlation with human-rated surgical performance.
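The three motion metrics named above can be illustrated with a minimal sketch. It assumes a tip trajectory given as a list of `(x, y)` pixel coordinates, one per frame, and a known frame rate `fps`. The abstract does not define "covered area", so the convex-hull area used here is an assumption, and the function names are hypothetical rather than taken from the study's code.

```python
import math

def path_length(points):
    """Total path length: sum of distances between consecutive tip positions (pixels)."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def max_speed(points, fps):
    """Maximal speed: largest frame-to-frame displacement, in pixels per second."""
    return max((math.dist(p, q) * fps for p, q in zip(points, points[1:])), default=0.0)

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def covered_area(points):
    """Covered area, ASSUMED here to be the convex-hull area of the trajectory (pixels^2)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    # Shoelace formula over the hull polygon.
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]))
    return abs(twice_area) / 2.0

track = [(0, 0), (3, 4), (3, 0), (0, 0)]  # toy trajectory, not study data
print(path_length(track), max_speed(track, fps=2), covered_area(track))
```

Other definitions of covered area, such as the area of a bounding ellipse or a count of visited grid cells, would fit the same interface.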

Results: The mean average precision score and intersection-over-union for mask detection were 0.78 and 0.82, respectively. The average distances between the predicted and true positions of the pupil center, phaco tip, and second instrument tip were 5.8, 9.1, and 17.1 pixels, respectively. The total path length and covered areas of these landmarks were negatively correlated with surgical performance.
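For readers unfamiliar with the two evaluation quantities reported above, the following sketch shows how intersection-over-union for one pair of binary masks and the mean pixel distance between predicted and true landmarks could be computed. It assumes masks stored as equally sized nested lists of 0/1. How the study aggregated these values across classes and frames is not stated in the abstract, so this is only an illustration with hypothetical function names.

```python
import math

def mask_iou(pred, true):
    """Intersection-over-union of two equally sized binary masks (nested lists of 0/1)."""
    inter = union = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treat that case as IoU = 1.
    return inter / union if union else 1.0

def mean_landmark_error(pred_points, true_points):
    """Mean Euclidean distance (pixels) between paired predicted and true landmarks."""
    distances = [math.dist(p, t) for p, t in zip(pred_points, true_points)]
    return sum(distances) / len(distances)

pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
print(mask_iou(pred_mask, true_mask))                               # overlap 2, union 4
print(mean_landmark_error([(0, 0), (10, 10)], [(3, 4), (10, 10)]))  # toy points
```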

Conclusions: We developed a deep-learning method to localize key anatomical portions of the eye and cataract surgical tools, which can be used to automatically derive metrics correlated with surgical skill.

Translational relevance: Our system could form the basis of an automated feedback system that helps cataract surgeons evaluate their performance.

Conflict of interest statement

Disclosure: H.-H. Yeh, None; A.M. Jain, None; O. Fox, None; K. Sebov, None; S.Y. Wang, None

Figures

**Figure 1.**
Overview of the two-step deep learning and computer vision algorithm for identification of eye anatomy and surgical tools and their landmarks. Surgical video frames are input into the trained YOLACT model, which generates masks for surgical instruments and pupils. Contours of the masks are identified and either ellipses or lines are fitted according to the type of object. The pupil centers, the tooltips, and the orientation of surgical tools can be determined from the mask contours. Combining the information across sequential frames, performance-related motion metrics can be automatically generated.
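The second step of the pipeline, going from a mask to a landmark, can be sketched as follows. The study fits ellipses or lines to mask contours. This sketch substitutes a simpler moment-based approach (mask centroid and principal axis) so that it runs without an image-processing library. It also assumes, without confirmation from the source, that the tool tip is the end of the instrument nearest the pupil center. All function names are hypothetical.

```python
import math

def mask_pixels(mask):
    """List the (x, y) coordinates of all nonzero pixels in a nested-list mask."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]

def centroid(pixels):
    """Mean position of the mask pixels; stands in for a fitted ellipse center."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def principal_axis(pixels):
    """Unit vector along the mask's main direction, from second-order moments."""
    cx, cy = centroid(pixels)
    sxx = sum((x - cx) ** 2 for x, _ in pixels)
    syy = sum((y - cy) ** 2 for _, y in pixels)
    sxy = sum((x - cx) * (y - cy) for x, y in pixels)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def tool_tip(tool_mask, pupil_center):
    """Pick the extreme mask pixel along the tool axis that lies nearer the pupil.

    ASSUMPTION: the working tip points toward the pupil center.
    """
    pixels = mask_pixels(tool_mask)
    cx, cy = centroid(pixels)
    ux, uy = principal_axis(pixels)

    def projection(p):
        return (p[0] - cx) * ux + (p[1] - cy) * uy

    ends = (min(pixels, key=projection), max(pixels, key=projection))
    return min(ends, key=lambda p: math.dist(p, pupil_center))

# Toy masks, not study data: a horizontal instrument and a pupil to its right.
tool = [[0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0]]
print(tool_tip(tool, pupil_center=(6, 1)))
```

With a library such as OpenCV available, the centroid and principal-axis steps would typically be replaced by contour extraction followed by ellipse or line fitting, which is closer to what the caption describes.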

**Figure 2.**
**Examples of predicted segmentation masks, tip positions for cataract surgical tools, and center of pupils.** The *yellow* regions represent masks predicted by YOLACT model for the object class. *Red crosses* indicate the point localized by the landmark identification algorithm.

**Figure 3.**
**An example of predicted trajectory and the true trajectory of the phacoemulsification probe tip.** *Green* and *red* lines indicate the predicted and true trajectory of the phacoemulsification probe tip from a randomly selected 50-second clip from the test videos. Coordinates are plotted every 0.5 seconds and lighter colors represent earlier frames.
