Journal of Experimental Orthopaedics (Jul 2024)

ChatGPT can yield valuable responses in the context of orthopaedic trauma surgery

  • Janina Kaarre,
  • Robert Feldt,
  • Bálint Zsidai,
  • Eric Hamrin Senorski,
  • Emilia Möller Rydberg,
  • Olof Wolf,
  • Sebastian Mukka,
  • Michael Möller,
  • Kristian Samuelsson

DOI
https://doi.org/10.1002/jeo2.12047
Journal volume & issue
Vol. 11, no. 3

Abstract

Purpose: To assess whether Chat Generative Pretrained Transformer (ChatGPT) can be used in the context of orthopaedic trauma surgery, by posing questions to ChatGPT and having orthopaedic trauma surgeons evaluate the responses for correctness, completeness and adaptiveness.

Methods: ChatGPT (GPT-4, version of 12 May 2023) was asked to address 34 common orthopaedic trauma surgery-related questions and to generate responses suited to three target groups: patient, nonorthopaedic medical doctor and expert orthopaedic surgeon. Three orthopaedic trauma surgeons independently rated each response for correctness, completeness and adaptiveness on a three-point scale from 0 to 2, where a higher score indicates better performance.

Results: A total of 18 (52.9%) of the responses were assessed to be correct (2.0) for the patient target group, while 22 (64.7%) and 24 (70.6%) of the responses were assessed to be correct for nonorthopaedic medical doctors and expert orthopaedic surgeons, respectively. Moreover, 18 (52.9%), 25 (73.5%) and 28 (82.4%) of the responses were assessed to be complete (2.0) for patients, nonorthopaedic medical doctors and expert orthopaedic surgeons, respectively. The average adaptiveness was 1.93, 1.95 and 1.97 for patients, nonorthopaedic medical doctors and expert orthopaedic surgeons, respectively.

Conclusion: The results indicate that ChatGPT can yield valuable and overall correct responses in the context of orthopaedic trauma surgery across the three target groups: patients, nonorthopaedic medical doctors and expert orthopaedic surgeons. The average correctness, completeness and adaptiveness scores indicate that ChatGPT can generate overall correct and complete responses adapted to the target group.

Level of Evidence: Not applicable.

Keywords