Cancer Medicine (Feb 2022)

Improved risk prediction of chemotherapy‐induced neutropenia—model development and validation with real‐world data

  • Mikko S. Venäläinen,
  • Eetu Heervä,
  • Outi Hirvonen,
  • Sohrab Saraei,
  • Tomi Suomi,
  • Toni Mikkola,
  • Maarit Bärlund,
  • Sirkku Jyrkkiö,
  • Tarja Laitinen,
  • Laura L. Elo

DOI
https://doi.org/10.1002/cam4.4465
Journal volume & issue
Vol. 11, no. 3
pp. 654 – 663

Abstract

Background The existing risk prediction models for chemotherapy‐induced febrile neutropenia (FN) do not necessarily apply to real‐life patients in different healthcare systems, and external validation of these models is often lacking. Our study evaluates whether a machine learning‐based risk prediction model could outperform the previously introduced models, especially when validated against real‐world patient data from another institution that were not used for model training.

Methods Using Turku University Hospital electronic medical records, we identified all patients who received chemotherapy for non‐hematological cancer between 2010 and 2017 (N = 5879). The experimental surrogate endpoint was first‐cycle neutropenic infection (NI), defined as grade IV neutropenia with serum C‐reactive protein >10 mg/l. To predict the risk of NI, a penalized regression model (Lasso) was developed. The model was externally validated in an independent dataset (N = 4594) from Tampere University Hospital.

Results The Lasso model predicted NI risk with good accuracy (AUROC 0.84). In the validation cohort, the Lasso model achieved an AUROC of 0.75 and outperformed two previously introduced, widely approved models. The variables selected by Lasso included granulocyte colony‐stimulating factor (G‐CSF) use, cancer type, pre‐treatment neutrophil and thrombocyte counts, intravenous treatment regimen, and the planned dose intensity. The same model also predicted FN, with AUROC 0.77, supporting the validity of NI as an endpoint.

Conclusions Our study demonstrates that real‐world NI risk prediction can be improved with machine learning and that every difference in patient or treatment characteristics can have a significant impact on model performance. Here we outline a novel, externally validated approach that could facilitate more targeted use of G‐CSFs in the future.
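The AUROC values quoted in the abstract (0.84, 0.75, 0.77) can be read as the probability that a randomly chosen patient who developed the endpoint received a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that calculation in plain Python. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not the study's data or the authors' code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the endpoint (e.g. NI), 0 otherwise.
    scores: predicted risks; higher means higher predicted risk.
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 patients with the endpoint, 4 without.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.30, 0.10, 0.90]
toy_auroc = auroc(labels, scores)
print(f"AUROC on toy data: {toy_auroc:.3f}")
```

An AUROC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so for cohorts of several thousand patients a rank-based implementation would be the more practical choice.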

Keywords