A practical guide to the implementation of AI in orthopaedic research, Part 6: How to evaluate the performance of AI research?

Felix C. Oettl; Ayoosh Pareek; Philipp W. Winkler; Bálint Zsidai; James A. Pruneski; Eric Hamrin Senorski; Sebastian Kopf; Christophe Ley; Elmar Herbst; Jacob F. Oeding; Alberto Grassi; Michael T. Hirschmann; Volker Musahl; Kristian Samuelsson; Thomas Tischer; Robert Feldt; ESSKA Artificial Intelligence Working Group

doi:10.1002/jeo2.12039

Journal of Experimental Orthopaedics (Jul 2024)

Affiliations

Felix C. Oettl: Hospital for Special Surgery New York New York USA
Ayoosh Pareek: Sports Medicine and Shoulder Institute, Hospital for Special Surgery New York New York USA
Philipp W. Winkler: Department for Orthopaedics and Traumatology, Kepler University Hospital GmbH Johannes Kepler University Linz Linz Austria
Bálint Zsidai: Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy University of Gothenburg Gothenburg Sweden
James A. Pruneski: Department of Orthopaedic Surgery Tripler Army Medical Center Honolulu Hawaii USA
Eric Hamrin Senorski: Sahlgrenska Sports Medicine Center Göteborg Sweden
Sebastian Kopf: Center of Orthopaedics and Traumatology, University Hospital Brandenburg an der Havel, Brandenburg Medical School Theodor Fontane Germany
Christophe Ley: Department of Mathematics University of Luxembourg Esch‐sur‐Alzette Luxembourg
Elmar Herbst: Department of Trauma, Hand and Reconstructive Surgery University Hospital Muenster Muenster Germany
Jacob F. Oeding: Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy University of Gothenburg Gothenburg Sweden
Alberto Grassi: IIa Clinica Ortopedica e Traumatologica, IRCCS Istituto Ortopedico Rizzoli Bologna Italy
Michael T. Hirschmann: Department of Orthopaedic Surgery and Traumatology Kantonsspital Baselland Bruderholz Switzerland
Volker Musahl: Department of Orthopaedic Surgery, UPMC Freddie Fu Sports Medicine Center University of Pittsburgh Pittsburgh Pennsylvania USA
Kristian Samuelsson: Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy University of Gothenburg Gothenburg Sweden
Thomas Tischer: Department of Orthopaedic Surgery Universitymedicine Rostock Rostock Germany
Robert Feldt: Department of Computer Science and Engineering Chalmers University of Technology Gothenburg Sweden
ESSKA Artificial Intelligence Working Group

DOI: https://doi.org/10.1002/jeo2.12039
Journal volume & issue: Vol. 11, no. 3

Abstract

The accelerating progress of artificial intelligence (AI) demands rigorous evaluation standards to ensure its safe and effective integration into high-stakes healthcare decisions. As AI increasingly enables prediction, analysis and judgement capabilities relevant to medicine, proper evaluation and interpretation are indispensable. Erroneous AI could endanger patients; thus, developing, validating and deploying medical AI demands adherence to strict, transparent standards centred on safety, ethics and responsible oversight. Core considerations include assessing performance on diverse real-world data, collaborating with domain experts, confirming model reliability and limitations, and advancing interpretability. Thoughtful selection of evaluation metrics suited to the clinical context, along with testing on diverse data sets representing different populations, improves generalisability. Partnerships among software engineers, data scientists and medical practitioners ground assessment in real needs. Journals must uphold reporting standards that match AI's societal impact. With rigorous, holistic evaluation frameworks, AI can progress towards expanding healthcare access and quality.

Level of Evidence: Level V.

Published in Journal of Experimental Orthopaedics

ISSN: 2197-1153 (Online)
Publisher: Wiley
Country of publisher: United States
LCC subjects: Medicine: Surgery: Orthopedic surgery
Website: https://esskajournals.onlinelibrary.wiley.com/journal/21971153
