Computerized adaptive testing for the Oxford Hip, Knee, Shoulder, and Elbow scores: accurate measurement from fewer, and more patient-focused, questions

Conrad J. Harrison; Otho R. Plummer; Jill Dawson; Crispin Jenkinson; Audrey Hunt; Jeremy N. Rodrigues

doi:10.1302/2633-1462.310.BJO-2022-0073.R1

Bone & Joint Open (Oct 2022)

Affiliations

Conrad J. Harrison: Methodology Oxford Limited, London, UK
Otho R. Plummer: Universal Research Solutions, Columbia, Missouri, USA
Jill Dawson: Nuffield Department of Population Health, University of Oxford, Oxford, UK
Crispin Jenkinson: Nuffield Department of Population Health, University of Oxford, Oxford, UK
Audrey Hunt: Universal Research Solutions, Columbia, Missouri, USA
Jeremy N. Rodrigues: Methodology Oxford Limited, London, UK

DOI: https://doi.org/10.1302/2633-1462.310.BJO-2022-0073.R1
Journal volume & issue: Vol. 3, no. 10
pp. 786 – 794

Abstract

Aims: The aim of this study was to develop and evaluate machine-learning-based computerized adaptive tests (CATs) for the Oxford Hip Score (OHS), Oxford Knee Score (OKS), Oxford Shoulder Score (OSS), and the Oxford Elbow Score (OES) and its subscales.

Methods: We developed CAT algorithms for the OHS, OKS, OSS, overall OES, and each of the OES subscales, using responses to the full-length questionnaires and a machine-learning technique called regression tree learning. The algorithms were evaluated through a series of simulation studies, in which they aimed to predict respondents’ full-length questionnaire scores from only a selection of their item responses. In each case, the total number of items used by the CAT algorithm was recorded and CAT scores were compared to full-length questionnaire scores by mean, SD, score distribution plots, Pearson’s correlation coefficient, intraclass correlation (ICC), and the Bland-Altman method. Differences between CAT scores and full-length questionnaire scores were contextualized through comparison to the instruments’ minimal clinically important difference (MCID).

Results: The CAT algorithms accurately estimated 12-item questionnaire scores from between four and nine items. Scores followed a very similar distribution between CAT and full-length assessments, with the mean score difference ranging from 0.03 to 0.26 out of 48 points. Pearson’s correlation coefficient and ICC were 0.98 for each 12-item scale and 0.95 or higher for the OES subscales. In over 95% of cases, a patient’s CAT score was within five points of the full-length questionnaire score for each 12-item questionnaire.

Conclusion: Oxford Hip Score, Oxford Knee Score, Oxford Shoulder Score, and Oxford Elbow Score (including separate subscale scores) CATs all markedly reduce the burden of items to be completed without sacrificing score accuracy.

Cite this article: Bone Jt Open 2022;3(10):786–794.
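The agreement statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `agreement_summary`, the example scores, and the default five-point threshold (taken from the "within five points" result above) are assumptions for illustration only, and the ICC is omitted for brevity. The sketch computes the mean CAT-minus-full difference, the Bland-Altman 95% limits of agreement (mean difference ± 1.96 SD of the differences), Pearson's correlation coefficient, and the proportion of pairs within the threshold.

```python
import math
import statistics


def agreement_summary(cat_scores, full_scores, threshold=5.0):
    """Summarize agreement between paired CAT and full-length scores.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(cat_scores) != len(full_scores) or len(cat_scores) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    # Bland-Altman: work on the paired differences (CAT minus full-length).
    diffs = [c - f for c, f in zip(cat_scores, full_scores)]
    bias = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)

    # Pearson's r from sums of squares and cross-products.
    mean_c = statistics.mean(cat_scores)
    mean_f = statistics.mean(full_scores)
    cov = sum((c - mean_c) * (f - mean_f) for c, f in zip(cat_scores, full_scores))
    ss_c = sum((c - mean_c) ** 2 for c in cat_scores)
    ss_f = sum((f - mean_f) ** 2 for f in full_scores)
    if ss_c == 0 or ss_f == 0:
        raise ValueError("Pearson's r is undefined when one set of scores is constant")
    r = cov / math.sqrt(ss_c * ss_f)

    # Share of respondents whose CAT score is within `threshold` points.
    within = sum(abs(d) <= threshold for d in diffs) / len(diffs)

    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
        "pearson_r": r,
        "prop_within": within,
    }


# Made-up scores on a 0 to 48 scale, purely to show the call.
example_cat = [40, 22, 35, 10, 47, 30]
example_full = [41, 20, 35, 12, 46, 30]
summary = agreement_summary(example_cat, example_full)
```

In a real evaluation, the two lists would hold each simulated respondent's CAT score and full-length questionnaire score, and the limits of agreement would be read against the instrument's MCID.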

Published in Bone & Joint Open

ISSN: 2633-1462 (Online)
Publisher: The British Editorial Society of Bone & Joint Surgery
Country of publisher: United Kingdom
LCC subjects: Medicine: Surgery: Orthopedic surgery
Website: https://online.boneandjoint.org.uk/journal/bjo
