BMJ Open Sport & Exercise Medicine (Dec 2022)

Predictive models for musculoskeletal injury risk: why statistical approach makes all the difference

  • Gary S Collins,
  • Deydre S Teyhen,
  • Daniel I Rhon,
  • Garrett S Bullock

DOI
https://doi.org/10.1136/bmjsem-2022-001388
Journal volume & issue
Vol. 8, no. 4

Abstract

Objective Compare the performance of an injury prediction model that categorised predictors with models that did not, and compare selecting predictors on univariate significance with assessing non-linear relationships.

Methods Validation and replication of a previously developed injury prediction model in a cohort of 1466 service members followed for 1 year after physical performance, medical history and sociodemographic variables were collected. The original model dichotomised 11 predictors. The second model (M2) kept predictors continuous but assumed linearity, and the third model (M3) applied non-linear transformations. The fourth model (M4) selected predictors on the basis of clinical reasoning and supporting evidence. Model performance was assessed with R², calibration in the large, calibration slope and discrimination. Decision curve analyses were performed with risk thresholds from 0.25 to 0.50.

Results 478 personnel sustained an injury. The original model demonstrated poorer R² (original: 0.07; M2: 0.63; M3: 0.64; M4: 0.08), calibration in the large (original: −0.11 (95% CI −0.22 to 0.00); M2: −0.02 (95% CI −0.17 to 0.13); M3: 0.03 (95% CI −0.13 to 0.19); M4: −0.13 (95% CI −0.25 to −0.01)), calibration slope (original: 0.84 (95% CI 0.61 to 1.07); M2: 0.97 (95% CI 0.86 to 1.08); M3: 0.90 (95% CI 0.75 to 1.05); M4: 0.81 (95% CI 0.59 to 1.03)) and discrimination (original: 0.63 (95% CI 0.60 to 0.66); M2: 0.90 (95% CI 0.88 to 0.92); M3: 0.90 (95% CI 0.88 to 0.92); M4: 0.63 (95% CI 0.60 to 0.66)). At a risk threshold of 0.25, M2 and M3 demonstrated a 0.43 net benefit improvement. At a risk threshold of 0.50, M2 and M3 demonstrated a 0.33 net benefit improvement compared with the original model.

Conclusion Model performance was substantially worse in the models with dichotomised variables. This highlights the need to follow established recommendations when developing prediction models.
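For readers unfamiliar with decision curve analysis, the net benefit at a risk threshold can be sketched as below. This is a minimal illustration of the standard net benefit formula (true positives per person minus false positives per person, weighted by the odds of the threshold) and is not the authors' code. The function names `net_benefit` and `net_benefit_treat_all` and the toy data are hypothetical and do not come from the study.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the risk threshold
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data for illustration only (not from the study cohort).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]                       # 1 = injured
risk = [0.9, 0.6, 0.4, 0.2, 0.3, 0.55, 0.1, 0.05]       # predicted injury risk

nb_025 = net_benefit(y_true, risk, 0.25)
nb_050 = net_benefit(y_true, risk, 0.50)
nb_all_025 = net_benefit_treat_all(y_true, 0.25)
print(round(nb_025, 3), round(nb_050, 3), round(nb_all_025, 3))
```

A difference in net benefit between two models at the same threshold, such as the improvements reported for M2 and M3, would be obtained by evaluating `net_benefit` for each model's predicted risks and subtracting.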