Acta Orthopaedica et Traumatologica Turcica (Jan 2024)
Comparison of artificial intelligence algorithm for the diagnosis of hip fracture on plain radiography with decision-making physicians: a validation study
Abstract
Objective: This study aimed to compare an algorithm developed for diagnosing hip fractures on plain radiographs with the physicians involved in diagnosing hip fractures. Methods: Radiographs labeled as fractured (n=182) and non-fractured (n=542) by an expert on proximal femur fractures were included in the study. General practitioners in the emergency department (n=3), emergency medicine physicians (n=3), radiologists (n=3), orthopedic residents (n=3), and orthopedic surgeons (n=3) served as labelers. On each anteroposterior (AP) plain pelvis radiograph, they labeled the right and left proximal femoral regions as fractured or non-fractured. In addition, all the radiographs were evaluated using an artificial intelligence (AI) algorithm consisting of 3 AI models and a majority voting technique. Each AI model evaluated each radiograph separately, and the final decision was the output given by the majority of the 3 AI models. The results of the AI algorithm and of the labeling physicians were compared with the reference evaluation. Results: Ranked by mean F1 score, the groups were: majority voting (0.942) > orthopedic surgeon (0.938) > AI models (0.917) > orthopedic resident (0.858) > emergency medicine (0.758) > general practitioner (0.689) > radiologist (0.677). Conclusion: The AI algorithm developed in our previous study may help non-orthopedist physicians in the emergency department recognize hip fractures on AP plain pelvis radiographs. Level of Evidence: Level IV, Diagnostic Study.
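The majority voting step and the F1 comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label encoding (1 = fractured, 0 = non-fractured), the function names `majority_vote` and `f1_score`, and the toy labels are assumptions made for illustration only, and the abstract does not state how the F1 scores were aggregated across sides or labelers.

```python
def majority_vote(model_outputs):
    """Combine binary labels from several models, one decision per case.

    model_outputs: list of per-model label lists (1 = fractured, 0 = not).
    A case is called fractured when more than half of the models say so.
    """
    n_models = len(model_outputs)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_outputs)]


def f1_score(reference, predicted):
    """F1 = harmonic mean of precision and recall for the 'fractured' class."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive calls, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (not study data): six proximal femur regions.
reference = [1, 1, 0, 0, 1, 0]   # expert reference labels
model_a = [1, 0, 0, 0, 1, 0]     # misses one fracture
model_b = [1, 1, 0, 1, 1, 0]     # one false alarm
model_c = [0, 1, 0, 0, 1, 1]     # one miss, one false alarm

ensemble = majority_vote([model_a, model_b, model_c])
print("ensemble labels:", ensemble)
print("F1 model_a :", f1_score(reference, model_a))
print("F1 ensemble:", f1_score(reference, ensemble))
```

In this toy case each model errs on a different region, so the vote cancels the individual errors. That is one plausible reason an ensemble can score above the mean of its members, as the reported ordering (majority voting 0.942 versus AI models 0.917) suggests.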