BMC Musculoskeletal Disorders (Aug 2024)

Clinical validation of artificial intelligence-based preoperative virtual reduction for Neer 3- or 4-part proximal humerus fractures

  • Young Dae Jeon,
  • Kwang-Hwan Jung,
  • Moo-Sub Kim,
  • Hyeonjoo Kim,
  • Do-Kun Yoon,
  • Ki-Bong Park

DOI
https://doi.org/10.1186/s12891-024-07798-z
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 11

Abstract


Background If artificial-intelligence (AI)-based technology can provide reduction images of fractures in advance, it can assist with preoperative surgical planning. We recently developed an AI-based preoperative virtual reduction model for orthopedic trauma that automatically segments and reduces fracture fragments. The purpose of this study was to validate the quality of the reduction models of Neer 3- or 4-part proximal humerus fractures produced by this AI-based technology.
Methods To develop the AI-based preoperative virtual reduction model, deep learning performed the segmentation of fracture fragments, and a Monte Carlo simulation completed the virtual reduction to determine the best model. A total of 20 pre/postoperative three-dimensional computed tomography (CT) scans of proximal humerus fractures were prepared. The preoperative CT scans were used as the input to AI-based automated reduction (AI-R) to generate reduction models of the fracture fragments; manual reduction (MR) was performed on the same CT images. The Dice similarity coefficient (DSC) and intersection over union (IoU) between the reduction models from AI-R/MR and the postoperative CT scans were evaluated. Working times were compared between the two groups. Clinical validity agreement (CVA) and reduction quality score (RQS) were assessed by 20 orthopedic surgeons as clinical validation outcomes.
Results The mean DSC and IoU were better with AI-R than with MR (0.78 ± 0.13 vs. 0.69 ± 0.16, p < 0.001 and 0.65 ± 0.16 vs. 0.55 ± 0.18, p < 0.001, respectively). The working time of AI-R was, on average, 1.41% of that of MR. The mean CVA of all cases was 81% ± 14.7% (AI-R, 82.25% ± 14.27%; MR, 76.75% ± 14.17%; p = 0.06). The mean RQS was significantly higher with AI-R than with MR (91.47 ± 1.12 vs. 89.30 ± 1.62, p = 0.045).
Conclusion The AI-based preoperative virtual reduction model produced good reduction models of proximal humerus fractures with faster working times than manual reduction. Beyond diagnosis, classification, and outcome prediction, AI-based technology can change the paradigm of preoperative surgical planning in orthopedic surgery.
Level of evidence Level IV.
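
For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how DSC and IoU can be computed between two voxelized bone models. It is not the authors' implementation: the function name `dice_and_iou`, the representation of each model as a set of occupied voxel coordinates, and the toy data are all assumptions made for illustration, and the abstract does not state how the reduction models were aligned or voxelized before comparison.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, int, int]  # (x, y, z) index of an occupied voxel


def dice_and_iou(model_a: Iterable[Voxel], model_b: Iterable[Voxel]) -> Tuple[float, float]:
    """Return (DSC, IoU) for two models given as collections of occupied voxels.

    DSC = 2 * |A ∩ B| / (|A| + |B|)
    IoU = |A ∩ B| / |A ∪ B|
    """
    a, b = set(model_a), set(model_b)
    intersection = len(a & b)
    union = len(a | b)
    if union == 0:
        # Both models are empty. Treating this as perfect agreement is an
        # assumption of this sketch, not something stated in the abstract.
        return 1.0, 1.0
    dsc = 2 * intersection / (len(a) + len(b))
    iou = intersection / union
    return dsc, iou


# Toy data (hypothetical): a 4 x 4 single-slice "reduction model" and a
# "postoperative" model of the same size shifted by one voxel along x.
reduced = {(x, y, 0) for x in range(4) for y in range(4)}
postop = {(x, y, 0) for x in range(1, 5) for y in range(4)}

dsc, iou = dice_and_iou(reduced, postop)
print(f"DSC = {dsc:.2f}, IoU = {iou:.2f}")  # DSC = 0.75, IoU = 0.60
```

For any single pair of models the two metrics are related by DSC = 2 · IoU / (1 + IoU), so they always rank a given pair of reductions in the same order; IoU penalizes partial overlap more heavily, which is why it is the lower of the two values.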

Keywords