Foot & Ankle Orthopaedics (Dec 2024)

Enhancing Fracture Detection in Remote Settings: Evaluating the Efficacy of FIXUS AI Deep Learning Algorithms in Identifying Fifth Metatarsal Fractures Using Mixed-Quality X-rays

  • Atta Taseh MD,
  • Alireza Gholipour PhD,
  • Mani Eftekhari,
  • Alireza Ebrahimi MD,
  • Alexandra F. Flaherty MD, MS,
  • Sumner Jones,
  • Varun Nukala,
  • Gregory R. Waryasz MD,
  • Daniel Guss MD, MBA,
  • John Y. Kwon MD,
  • Christopher W. DiGiovanni MD,
  • Lorena Bejarano-Pineda MD,
  • Soheil Ashkani-Esfahani MD

DOI
https://doi.org/10.1177/2473011424S00269
Journal volume & issue
Vol. 9

Abstract

Category: Midfoot/Forefoot; Trauma

Introduction/Purpose: Diagnosing fractures can be challenging in some clinical settings because of limited expertise or time. Although deep learning has shown promising results, its use is constrained by image quality and by the difficulty of importing images into the models. This study aimed to develop a model that detects fifth metatarsal fractures from cell phone photos of radiographs taken directly from a regular screen.

Methods: A retrospective case-control study was conducted, including patients aged > 18 years with fifth metatarsal fractures (Fx; n=1240) and healthy controls (NoF; n=1224). Three-view radiographs (anterior, posterior, and lateral) were obtained from the Electronic Health Record (EHR) in PNG format. To generate a mixed-quality dataset, Android and iOS smartphones (SP) were used to create two separate datasets for each of the Fx and NoF groups. Two separate deep learning models were developed on each of the EHR, SP, and combined datasets using the Inception V3 architecture (Figure 1). The models were also tested on a separate SP dataset (SP-test) that was not included in the development process. The Area Under the Receiver Operating Characteristic Curve (AUC) and other performance metrics were calculated and reported. Continuous data were presented as median (interquartile range), and a p-value of < 0.05 was considered significant.

Results: Baseline analysis revealed differences between the groups, with a median age of 56 years (36-68) for the Fx group and 62 years (51-72) for the NoF group (p < 0.001). The racial composition of the groups also differed (Fx: 84.8% white; NoF: 92.2% white; p < 0.001). Initially, the SP model showed the best performance (Youden Index (YI): 0.92, AUC: 0.99), followed by the EHR (YI: 0.74, AUC: 0.96) and combined (YI: 0.52, AUC: 0.97) models. When tested on the SP-test dataset, the EHR model’s performance dropped markedly, with a YI of 0.33, an AUC of 0.78, and a sensitivity of 0.49 (Table 1). However, the SP and combined models maintained strong performance (YI: 0.94, AUC: 0.99; YI: 0.78, AUC: 0.98, respectively).

Conclusion: This study highlights the crucial role of image quality in developing deep learning models for detecting fifth metatarsal fractures. Our findings demonstrate markedly reduced performance of the EHR model in identifying fractures within lower-quality images. This emphasizes the need to train algorithms on images of varying quality to create more generalizable models capable of operating effectively across diverse settings.
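
For readers less familiar with the reported metrics, the following is a minimal Python sketch of how a Youden Index and an AUC can be computed. It is not the authors' code: the function names and the toy scores are hypothetical, and the AUC is computed with the generic rank-based (pairwise comparison) definition rather than whatever tooling the study used. The last lines back out the specificity implied by the EHR model's SP-test figures (YI 0.33, sensitivity 0.49), assuming both values refer to the same operating point and ignoring rounding in the reported numbers.

```python
def youden_index(sensitivity, specificity):
    """Youden Index: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def implied_specificity(yi, sensitivity):
    """Rearranged Youden formula: specificity = J - sensitivity + 1."""
    return yi - sensitivity + 1.0


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (fracture, control) pairs in which the
    fracture case receives the higher score; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores, for illustration only (not study data).
toy_fx = [0.9, 0.8, 0.4]     # predicted fracture probability, Fx cases
toy_nof = [0.7, 0.3, 0.2]    # predicted fracture probability, NoF controls
toy_auc = auc_from_scores(toy_fx, toy_nof)

# Figures reported in the abstract for the EHR model on SP-test.
ehr_spec = implied_specificity(0.33, 0.49)
print(f"toy AUC: {toy_auc:.3f}")
print(f"implied EHR specificity on SP-test: {ehr_spec:.2f}")
```

Under those assumptions the implied specificity is roughly 0.84, which suggests that the EHR model's drop on smartphone photos came mainly from missed fractures rather than from false alarms.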