Journal of Medical Internet Research (Apr 2022)

Optimal Treatment Selection in Sequential Systemic and Locoregional Therapy of Oropharyngeal Squamous Carcinomas: Deep Q-Learning With a Patient-Physician Digital Twin Dyad

  • Elisa Tardini,
  • Xinhua Zhang,
  • Guadalupe Canahuate,
  • Andrew Wentzel,
  • Abdallah S R Mohamed,
  • Lisanne Van Dijk,
  • Clifton D Fuller,
  • G Elisabeta Marai

DOI
https://doi.org/10.2196/29455
Journal volume & issue
Vol. 24, no. 4
p. e29455

Abstract


Background: Currently, the selection of patients for sequential versus concurrent chemotherapy and radiation regimens lacks evidentiary support and is based on locally optimal decisions at each step.

Objective: We aim to optimize the multistep treatment of patients with head and neck cancer and to predict multiple patient survival and toxicity outcomes. To this end, we develop, apply, and evaluate a first application of deep Q-learning (DQL) and simulation to this problem.

Methods: The treatment decision DQL digital twin and the patient's digital twin were created, trained, and evaluated on a data set of 536 patients with oropharyngeal squamous cell carcinoma. The goal of the former was to determine the optimal treatment decisions with respect to survival and toxicity metrics; the goal of the latter was to predict the outcomes of the optimal treatment on the patient. Of the 536 patients, the models were trained on a randomly selected subset of 402 (75%) patients and evaluated on a separate set of 134 (25%) patients. Training and evaluation of the digital twin dyad were completed in August 2020. The data set includes 3-step sequential treatment decisions and the complete relevant history of the patient cohort treated at MD Anderson Cancer Center between 2005 and 2013, with radiomics analysis performed for the segmented primary tumor volumes.

Results: On the test set, we found mean 87.35% (SD 11.15%) and median 90.85% (IQR 13.56%) accuracies in treatment outcome prediction, matching the clinicians' outcomes and improving the (predicted) survival rate by +3.73% (95% CI –0.75% to 8.96%) and the dysphagia rate by +0.75% (95% CI –4.48% to 6.72%) when following DQL treatment decisions.

Conclusions: Given the prediction accuracy and the predicted improvement in medically relevant outcomes yielded by this approach, this digital twin dyad of the patient-physician dynamic treatment problem has the potential to aid physicians in determining the optimal course of treatment and in assessing its outcomes.
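The random 75%/25% partition described in the Methods (402 training and 134 test patients out of 536) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_cohort`, the integer patient identifiers, and the seed are assumptions made purely for illustration, and the abstract does not say how the randomization was actually performed.

```python
import random


def split_cohort(patient_ids, train_fraction=0.75, seed=0):
    """Randomly partition patient IDs into a training set and a test set.

    Hypothetical helper: the name, the seed, and the use of Python's
    `random` module are illustrative assumptions, not the study's method.
    """
    ids = list(patient_ids)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 536 placeholder IDs stand in for the cohort described in the abstract.
train_ids, test_ids = split_cohort(range(536))
print(len(train_ids), len(test_ids))  # 402 134
```

Shuffling once and slicing guarantees that the two subsets are disjoint and together cover the whole cohort, which is the property a held-out evaluation relies on.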