BMJ Open (Mar 2023)

Temporal validation of a multivariable surgical mortality prediction model (NZRisk): a New Zealand national cohort study

  • Marta Seretny,
  • Doug Campbell,
  • Luke Boyle,
  • Thomas Lumley

DOI
https://doi.org/10.1136/bmjopen-2022-069911
Journal volume & issue
Vol. 13, no. 3

Abstract

Objectives Clinical risk calculators (CRCs), such as NZRisk, are used daily by clinicians to guide clinical decisions and explain individual risk to patients. The utility and robustness of these tools depend on the methods used to create the underlying mathematical model, as well as on the stability of that model as clinical practice and patient populations change over time. The latter should be checked by temporal validation using external data. Few, if any, of the clinical prediction models in current clinical use have published temporal validation. Here, we use a large external dataset to temporally validate NZRisk, a perioperative risk prediction model used in the New Zealand population.

Methods A sample of 1 976 362 adult non-cardiac surgical procedures, collected over 15 years from the New Zealand Ministry of Health National Minimum Dataset, was used to temporally validate NZRisk. We divided the dataset into 15 single-year cohorts and compared 13 of these to our NZRisk model (the 2 years used for model building were excluded). We compared the area under the curve (AUC) value, calibration slope and calibration intercept for each single-year cohort with the same values produced by the data used to create NZRisk, by fitting a random effects meta-regression with each year cohort acting as a separate study point. In addition, we used two-sided t-tests to compare each measure across the cohorts.

Results The AUC values for the 30-day NZRisk model applied to our single-year cohorts ranged from 0.918 to 0.940 (NZRisk AUC was 0.921). AUC values differed statistically significantly in 8 years: 2007–2009, 2016 and 2018–2021. The intercept values ranged from −0.004 to 0.007, and 7 years had statistically significantly different intercepts in leave-one-out t-tests: 2007–2010, 2012, 2018 and 2021. The slope values ranged from 0.72 to 1.12, and 7 years had statistically significantly different slopes in leave-one-out t-tests: 2010, 2011, 2017, 2018 and 2019–2021. The random effects meta-regression supported the between-year differences in AUC (0.54 (95% CI 0.40 to 0.99), I² 67.57 (95% CI 40.67 to 88.50), Cochran’s Q<0.001) and slope (τ 0.14 (95% CI 0.01 to 0.23), I² 98.61 (95% CI 97.31 to 99.50), Cochran’s Q<0.001).

Conclusion The NZRisk model shows differences in AUC and slope but not intercept values over time. The biggest differences were in the calibration slope. The models maintained excellent discrimination over time, as shown by the AUC values. These findings suggest that we should update our model in the next 5 years. To our knowledge, this is the first temporal validation of a CRC in current use.
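
As a rough illustration of the three measures compared across the yearly cohorts (AUC, calibration intercept and calibration slope), the Python sketch below computes them for one simulated cohort. It is not the authors' code. The abstract does not say how the calibration intercept and slope were estimated, so the logistic recalibration used here is an assumption, as are the function names `auc` and `calibration` and the simulated data.

```python
import math
import random


def auc(y, p):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    ranks = [0.0] * len(p)
    i = 0
    while i < len(order):
        j = i
        # Group tied predictions so they all receive their average rank.
        while j + 1 < len(order) and p[order[j + 1]] == p[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    rank_sum = sum(r for r, yi in zip(ranks, y) if yi == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration(y, p, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson.

    Returns (a, b): intercept and slope, estimated jointly. This is an
    assumed definition; some analyses instead estimate the intercept with
    the slope fixed at 1.
    """
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu           # gradient w.r.t. intercept
            g1 += (yi - mu) * xi    # gradient w.r.t. slope
            h00 += w                # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated cohort (hypothetical data): outcomes are drawn from the
# predicted risks, so the predictions are well calibrated by construction.
rng = random.Random(0)
p_hat = [rng.uniform(0.01, 0.5) for _ in range(5000)]
y_obs = [1 if rng.random() < pi else 0 for pi in p_hat]

a, b = calibration(y_obs, p_hat)
print("AUC:", round(auc(y_obs, p_hat), 3))
print("calibration intercept:", round(a, 3), "slope:", round(b, 3))
```

Under this definition, a slope near 1 and an intercept near 0 indicate good calibration, and a slope below 1 suggests that the predicted risks are too extreme. Repeating the calculation for each single-year cohort would give the per-year values that a meta-regression could then compare.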