JBJS Open Access (Mar 2020)
Use of Computerized Adaptive Testing to Develop More Concise Patient-Reported Outcome Measures
Abstract
Background:. Patient-reported outcome measures (PROMs) are essential tools that are used to assess health status and treatment outcomes in orthopaedic care. Use of PROMs can burden patients with lengthy and cumbersome questionnaires. Predictive models using machine learning known as computerized adaptive testing (CAT) offer a potential solution. The purpose of this study was to evaluate the ability of CAT to improve efficiency of the Veterans RAND 12 Item Health Survey (VR-12) by decreasing the question burden while maintaining the accuracy of the outcome score. Methods:. A previously developed CAT model was applied to the responses of 19,523 patients who had completed a full VR-12 survey while presenting to 1 of 5 subspecialty orthopaedic clinics. This resulted in the calculation of both a full-survey and CAT-model physical component summary score (PCS) and mental component summary score (MCS). Several analyses compared the accuracy of the CAT model scores with that of the full scores by comparing the means and standard deviations, calculating a Pearson correlation coefficient and intraclass correlation coefficient, plotting the frequency distributions of the 2 score sets and the score differences, and performing a Bland-Altman assessment of scoring patterns. Results:. The CAT model required 4 fewer questions to be answered by each subject (33% decrease in question burden). The mean PCS was 1.3 points lower in the CAT model than with the full VR-12 (41.5 ± 11.0 versus 42.8 ± 10.4), and the mean MCS was 0.3 point higher (57.3 ± 9.4 versus 57.0 ± 9.6). The Pearson correlation coefficients were 0.97 for PCS and 0.98 for MCS, and the intraclass correlation coefficients were 0.96 and 0.97, respectively. The frequency distribution of the CAT and full scores showed significant overlap for both the PCS and the MCS. The difference between the CAT and full scores was less than the minimum clinically important difference (MCID) in >95% of cases for the PCS and MCS. 
Conclusions: Applying CAT to the VR-12 survey lessened the response burden for patients with a negligible effect on score integrity.
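The agreement analyses named in the Methods (Pearson correlation, intraclass correlation, and Bland-Altman limits of agreement) can be sketched as below. The paired scores are illustrative only, not the study's data, and the ICC form shown (two-way random effects, absolute agreement, single measures, i.e., ICC(2,1)) is an assumption, since the abstract does not state which ICC variant was used.

```python
import numpy as np

def icc_2_1(x, y):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.
    Treats the CAT and full-survey scores as 2 'raters' of n subjects."""
    scores = np.column_stack([x, y])            # n subjects x 2 score sets
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)             # per-subject means
    col_means = scores.mean(axis=0)             # per-score-set means
    ss_total = ((scores - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def bland_altman_limits(cat, full):
    """Mean difference (bias) and 95% limits of agreement (bias +/- 1.96 SD)."""
    diffs = np.asarray(cat) - np.asarray(full)
    bias = diffs.mean()
    sd = diffs.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired PCS scores (NOT from the study)
full_pcs = np.array([42.0, 55.0, 31.0, 48.0, 60.0, 37.0])
cat_pcs = np.array([41.0, 54.5, 30.0, 47.0, 59.0, 35.5])

r = np.corrcoef(cat_pcs, full_pcs)[0, 1]        # Pearson correlation
icc = icc_2_1(cat_pcs, full_pcs)                # absolute agreement
bias, lo, hi = bland_altman_limits(cat_pcs, full_pcs)
print(f"r={r:.3f}  ICC(2,1)={icc:.3f}  bias={bias:.2f}  LoA=({lo:.2f}, {hi:.2f})")
```

In a setting like the study's, one would then check how often the per-patient CAT-versus-full difference exceeds the MCID, as the abstract reports (<5% of cases for both summary scores).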