Clinical Ophthalmology (Mar 2024)
Comparison of Machine and Human Expert Evaluation of Capsulorrhexis Creation Performance Through Analysis of Surgical Video Recordings
Anvesh Annadanam,1 Ethan Kahana,1 Chris Andrews,1 Alexa R Thibodeau,1 Shahzad I Mian,1 Bradford L Tannen,1 Nambi Nallasamy1,2

1 Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
2 Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA

Correspondence: Nambi Nallasamy, Department of Ophthalmology and Visual Sciences, University of Michigan, 1000 Wall Street, Ann Arbor, MI, 48105, USA, Tel +1 734-763-8122, Email [email protected]

Abstract

Achieving competency in cataract surgery is an essential component of ophthalmology residency training. Video-based analysis of surgery can change training through its objective, reliable, and timely assessment of resident performance.

Methods: Using the Image Labeler application in MATLAB, the capsulorrhexis step of 208 surgical videos, recorded at the University of Michigan, was annotated for subjective and objective analysis. Two expert surgeons graded the creation of the capsulorrhexis based on the International Council of Ophthalmology’s Ophthalmology Surgical Competency Assessment Rubric: Phacoemulsification (ICO-OSCAR:phaco) rating scale and a custom rubric (eccentricity, roundness, size, centration) that focuses on the objective aspects of this step. The annotated rhexis frames were run through an automated analysis to obtain objective scores for these components. The subjective scores were compared using both intra- and inter-rater analyses to assess the consistency of a human-graded scale. The subjective and objective scores were compared using intraclass correlation methods to determine relative agreement.

Results: All rhexes were graded as 4/5 or 5/5 by both raters for both items 4 and 5 of the ICO-OSCAR:phaco rating scale. Only roundness scores were statistically different between the subjective graders (mean difference = −0.149, p-value = 0.0023). Subjective scores were highly correlated for all components (> 0.6). Correlations between objective and subjective scores were low (0.09 to 0.39).

Conclusion: Video-based analysis of cataract surgery presents significant opportunities, including the ability to asynchronously evaluate performance and provide longitudinal assessment. Subjective scoring between two raters was moderately correlated for each component.

Keywords: capsulorrhexis, artificial intelligence, surgical training, cataract surgery
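
The intraclass correlation used to compare scores can be illustrated with a minimal sketch. The abstract does not state which ICC form or software was used, so the code below assumes one common choice, ICC(2,1) (two-way random effects, absolute agreement, single rater). The function name `icc2_1` and the example ratings are hypothetical and are not the study's data or implementation.

```python
import numpy as np


def icc2_1(scores):
    """ICC(2,1) for an (n_subjects x k_raters) table of scores.

    Assumed form: two-way random effects, absolute agreement,
    single rater. The source does not specify the form it used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_raters = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_error = ((x - grand) ** 2).sum() - ss_subjects - ss_raters

    # Corresponding mean squares.
    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical ratings: rows are videos, columns are two raters
# (or one human rater and the automated score on the same scale).
example = [[4, 5], [3, 3], [5, 5], [2, 3], [4, 4]]
print(round(icc2_1(example), 3))
```

Each row is one video and each column one rater, so the same function would apply to a rater-versus-rater comparison or a rater-versus-automated comparison, provided both columns are on the same scale. Identical columns give a value of 1, and values near 0 indicate that disagreement between raters is as large as the variation between videos.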