Diagnostic and Prognostic Research (Jul 2018)

Reclassification calibration test for censored survival data: performance and comparison to goodness-of-fit criteria

  • Olga V. Demler,
  • Nina P. Paynter,
  • Nancy R. Cook

DOI
https://doi.org/10.1186/s41512-018-0034-5
Journal volume & issue
Vol. 2, no. 1
pp. 1 – 12

Abstract

Background: The risk reclassification table assesses the clinical performance of a biomarker in terms of movements across relevant risk categories. The Reclassification-Calibration (RC) statistic has been developed for binary outcomes, but its performance for survival data with moderate to high censoring rates has not been evaluated.

Methods: We develop an RC statistic for survival data with higher censoring rates using the Greenwood-Nam-D’Agostino approach (RC-GND). We examine its performance characteristics and compare its performance and utility to the Hosmer-Lemeshow goodness-of-fit test under various assumptions about the censoring rate and the shape of the baseline hazard.

Results: The RC-GND test was robust to high (up to 50%) censoring rates and did not exceed the targeted 5% Type I error in a variety of simulated scenarios. It achieved 80% power to detect better calibration with respect to clinical categories when an important predictor with a hazard ratio of at least 1.7 to 2.2 was added to the model, while the Hosmer-Lemeshow goodness-of-fit (GOF) test had power of 5% in this scenario.

Conclusions: The RC-GND test should be used to test the improvement in calibration with respect to clinically relevant risk strata. When an important predictor is omitted, the Hosmer-Lemeshow goodness-of-fit test is usually not significant, while the RC-GND test is sensitive to such an omission.
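The abstract does not give the formula for the RC-GND statistic, so the following is only a minimal sketch of a Greenwood-Nam-D’Agostino style calculation under stated assumptions. It assumes that each cell of the reclassification table contributes the squared difference between the Kaplan-Meier event probability at a fixed time horizon and the mean model-predicted probability, divided by the Greenwood variance of the Kaplan-Meier estimate, and that the sum is referred to a chi-square distribution. The function names, the argument layout, and the choice of degrees of freedom (number of cells minus one) are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def km_with_greenwood(times, events, horizon):
    """Return (Kaplan-Meier event probability at `horizon`, Greenwood variance).

    `events` holds 1 for an observed event and 0 for a censored observation.
    """
    deaths = defaultdict(int)  # observed events at each distinct time
    exits = defaultdict(int)   # events plus censorings at each distinct time
    for t, e in zip(times, events):
        exits[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(times)
    surv = 1.0
    greenwood_sum = 0.0
    for t in sorted(exits):
        if t > horizon:
            break
        d = deaths[t]
        if d > 0:
            if d == at_risk:
                # Survival drops to zero; the Greenwood term is undefined here.
                return 1.0, 0.0
            surv *= 1.0 - d / at_risk
            greenwood_sum += d / (at_risk * (at_risk - d))
        at_risk -= exits[t]
    return 1.0 - surv, surv ** 2 * greenwood_sum


def gnd_statistic(cells, times, events, predicted, horizon):
    """Sum of (observed - expected)^2 / variance over cells (assumed form).

    `cells` labels each subject's reclassification-table cell (hypothetical
    input layout). Returns (statistic, assumed degrees of freedom).
    """
    by_cell = defaultdict(list)
    for c, t, e, p in zip(cells, times, events, predicted):
        by_cell[c].append((t, e, p))

    stat = 0.0
    for rows in by_cell.values():
        observed, var = km_with_greenwood(
            [r[0] for r in rows], [r[1] for r in rows], horizon
        )
        if var <= 0.0:
            raise ValueError("cell has zero Greenwood variance; merge cells")
        expected = sum(r[2] for r in rows) / len(rows)
        stat += (observed - expected) ** 2 / var
    return stat, len(by_cell) - 1
```

In practice, cells with very few events would likely need to be merged before such a statistic is computed, since the Greenwood variance is unstable or zero in sparse cells; the sketch simply raises an error in the zero-variance case.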

Keywords