JBJS Open Access (Mar 2023)

Efficiency and Accuracy of Computerized Adaptive Testing for the Oswestry Disability Index and Neck Disability Index

  • Tracy Y. Zhu, PhD
  • Otho R. Plummer, PhD
  • Audrey Hunt, BS
  • Alexander Joeris, MD, MSc

DOI
https://doi.org/10.2106/JBJS.OA.22.00036
Journal volume & issue
Vol. 8, no. 1

Abstract

Background: This study aimed to determine the efficiency and accuracy of computerized adaptive testing (CAT) models of the Oswestry Disability Index (ODI) and Neck Disability Index (NDI).

Methods: The study involved simulation using retrospectively collected real-world data. Previously developed CAT models of the ODI and NDI were applied to the responses from 52,551 and 18,196 patients with spinal conditions, respectively. Efficiency was evaluated by the reduction in the number of questions administered. Accuracy was evaluated by comparing means and standard deviations, calculating Pearson r and intraclass correlation coefficient (ICC) values, plotting the frequency distributions of CAT and full questionnaire scores, plotting the frequency distributions of differences between paired scores, and Bland-Altman plotting. Score changes, calculated as the postoperative ODI or NDI scores minus the preoperative scores, were compared between the CAT and full versions in patients for whom both preoperative and postoperative ODI or NDI questionnaires were available.

Results: CAT models of the ODI and NDI required an average of 4.47 and 4.03 fewer questions per patient, respectively. The mean CAT ODI score was 0.7 point lower than the full ODI score (35.4 ± 19.0 versus 36.1 ± 19.3), and the mean CAT NDI score was 1.0 point higher than the full NDI score (34.7 ± 19.3 versus 33.8 ± 18.5). The Pearson r was 0.97 for both the ODI and NDI, and the ICC was 0.97 for both. The frequency distributions of the CAT and full scores showed marked overlap for the ODI and NDI. Differences between paired scores were less than the minimum clinically important difference in 98.9% of cases for the ODI and 98.5% for the NDI. Bland-Altman plots showed no proportional bias. The ODI and NDI score changes could be calculated in a subgroup of 6,044 and 4,775 patients, respectively; the distributions of the ODI and NDI score changes were nearly identical between the CAT and full versions.

Conclusions: CAT models were able to reduce the question burden of the ODI and NDI. Scores obtained from the CAT models were faithful to those from the full questionnaires, both on the population level and on the individual patient level.

Level of Evidence: Prognostic Level III. See Instructions for Authors for a complete description of levels of evidence.
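
The paired-score accuracy checks named in the Methods (Pearson r, differences between paired scores, Bland-Altman bias and limits of agreement, and the share of differences below a clinical threshold) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the five toy score pairs, and the threshold of 10 points are all assumptions chosen for illustration, and the abstract does not state the minimum clinically important difference that was used. The ICC is omitted because the abstract does not say which ICC form was computed.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def agreement_summary(cat_scores, full_scores, threshold):
    """Summarize agreement between paired CAT and full-questionnaire scores.

    `threshold` stands in for a minimum clinically important difference;
    its value is an assumption supplied by the caller.
    """
    # Paired differences: CAT score minus full-questionnaire score.
    diffs = [c - f for c, f in zip(cat_scores, full_scores)]
    bias = mean(diffs)   # Bland-Altman mean difference
    sd = stdev(diffs)    # sample standard deviation of the differences
    within = sum(abs(d) < threshold for d in diffs) / len(diffs)
    return {
        "pearson_r": pearson_r(cat_scores, full_scores),
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,  # lower 95% limit of agreement
        "loa_upper": bias + 1.96 * sd,  # upper 95% limit of agreement
        "share_within_threshold": within,
    }


# Made-up illustrative scores on a 0-100 scale (not study data).
cat_scores = [20.0, 34.0, 48.0, 60.0, 72.0]
full_scores = [22.0, 34.0, 46.0, 62.0, 70.0]
summary = agreement_summary(cat_scores, full_scores, threshold=10.0)
print(summary)
```

A score change would be handled the same way: compute postoperative minus preoperative scores separately for the CAT and full versions, then pass the two lists of changes to `agreement_summary`.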