Clinical and Translational Radiation Oncology (Jul 2023)

Benchmarking machine learning approaches to predict radiation-induced toxicities in lung cancer patients

  • Francisco J. Núñez-Benjumea,
  • Sara González-García,
  • Alberto Moreno-Conde,
  • José C. Riquelme-Santos,
  • José L. López-Guerra

Journal volume & issue
Vol. 41
p. 100640

Abstract

Background and purpose: Radiation-induced toxicities are common adverse events in lung cancer (LC) patients undergoing radiotherapy (RT). Accurate prediction of these adverse events might facilitate an informed, shared decision-making process between patient and radiation oncologist, with a clearer view of the life-balance implications of treatment choices. This work provides a benchmark of machine learning (ML) approaches to predict radiation-induced toxicities in LC patients, built on a real-world health dataset and following a generalizable methodology for their implementation and external validation.

Materials and Methods: Ten feature selection (FS) methods were combined with five ML-based classifiers to predict six RT-induced toxicities (acute esophagitis, acute cough, acute dyspnea, acute pneumonitis, chronic dyspnea, and chronic pneumonitis). A real-world health dataset (RWHD) built from 875 consecutive LC patients was used to train and validate the resulting 300 predictive models. Internal and external validation performance was measured as the area under the ROC curve (AUC) per clinical endpoint, FS method, and ML-based classifier.

Results: The best-performing predictive model for each clinical endpoint achieved performance comparable to state-of-the-art methods at internal validation (AUC ≥ 0.81 in all cases) and at external validation (AUC ≥ 0.73 in 5 out of 6 cases).

Conclusion: A benchmark of 300 different ML-based approaches was tested against an RWHD, achieving satisfactory results with a generalizable methodology. The outcomes suggest potential relationships between underrecognized clinical factors and the onset of acute esophagitis or chronic dyspnea, illustrating the potential of ML-based approaches to generate novel data-driven hypotheses in the field.
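
The count of 300 models follows from crossing the six endpoints with the ten FS methods and five classifiers (6 × 10 × 5). The sketch below is only an illustration of that grid and of a rank-based AUC; it is not the authors' code. The six endpoint names come from the abstract, while the `fs_*` and `clf_*` labels and the function names `benchmark_grid` and `auc` are placeholders, since the abstract does not name the FS methods or classifiers.

```python
from itertools import product

# The six clinical endpoints are named in the abstract.
ENDPOINTS = [
    "acute esophagitis", "acute cough", "acute dyspnea",
    "acute pneumonitis", "chronic dyspnea", "chronic pneumonitis",
]
# Placeholder labels (assumption): the abstract does not name
# the ten FS methods or the five classifiers.
FS_METHODS = [f"fs_{i}" for i in range(1, 11)]
CLASSIFIERS = [f"clf_{i}" for i in range(1, 6)]


def benchmark_grid():
    """Enumerate every (endpoint, FS method, classifier) combination."""
    return list(product(ENDPOINTS, FS_METHODS, CLASSIFIERS))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(len(benchmark_grid()))  # one model per grid cell
```

In a full benchmark, each grid cell would be fitted on the training cohort and scored with `auc` on both the internal and the external validation sets, which gives one AUC per endpoint, FS method, and classifier.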

Keywords