Scientific Reports (Feb 2022)

Cross-institutional outcome prediction for head and neck cancer patients using self-attention neural networks

  • William Trung Le,
  • Eugene Vorontsov,
  • Francisco Perdigón Romero,
  • Lotfi Seddik,
  • Mohamed Mortada Elsharief,
  • Phuc Felix Nguyen-Tan,
  • David Roberge,
  • Houda Bahig,
  • Samuel Kadoury

DOI
https://doi.org/10.1038/s41598-022-07034-5
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 17

Abstract

In radiation oncology, patient risk stratification allows therapy intensification to be tailored and guides the choice between systemic and regional treatments, all of which helps to improve patient outcome and quality of life. Deep learning offers an advantage over traditional radiomics for medical image processing by learning salient features from training data originating from multiple datasets. However, while the large capacity of deep learning models allows them to combine high-level medical imaging data for outcome prediction, they do not generalize well enough to be used across institutions. In this work, a pseudo-volumetric convolutional neural network with a deep preprocessor module and self-attention (PreSANet) is proposed for predicting the probabilities of distant metastasis (DM), locoregional recurrence (LR), and overall survival (OS) within the 10-year follow-up time frame for head and neck cancer patients with squamous cell carcinoma. The model can process multi-modal inputs of variable scan length and integrate patient data into the prediction. These architectural features and additional modalities all serve to extract more information from the available data when access to additional samples is limited. The model was trained on the public Cancer Imaging Archive (TCIA) Head–Neck-PET–CT dataset, which consists of 298 patients undergoing curative radio/chemo-radiotherapy and was acquired from 4 different institutions. The model was further validated on an internal retrospective dataset of 371 patients acquired from one of the institutions in the training dataset. An extensive set of ablation experiments was performed to test the utility of the proposed model characteristics, achieving an AUROC of 80%, 80% and 82% for DM, LR and OS respectively on the public TCIA Head–Neck-PET–CT dataset. External validation was performed on a retrospective dataset of 371 patients, achieving 69% AUROC for all outcomes. To test for model generalization across sites, a validation scheme consisting of single site-holdout and cross-validation combining both datasets was used. The mean accuracy obtained across the 4 institutions was 72%, 70% and 71% for DM, LR and OS respectively. The proposed model demonstrates an effective method for multi-site, multi-modal tumor outcome prediction that combines volumetric data with structured patient clinical data.
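
The single site-holdout scheme and the AUROC metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `leave_one_site_out` and `auroc`, the site labels, and the example scores are all hypothetical, and the abstract does not specify how the splits or the metric were actually computed. The sketch assumes each patient carries one institution label and one binary outcome.

```python
from collections import defaultdict


def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices) for each site.

    Hypothetical helper: every institution is held out once as the test
    set while all remaining institutions form the training set.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(site_labels):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative labels only; these are not values from the study.
    sites = ["A", "A", "B", "C", "B"]
    for held_out, train_idx, test_idx in leave_one_site_out(sites):
        print(held_out, train_idx, test_idx)
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In such a scheme, a model would be refit on `train_idx` for each held-out site and scored on `test_idx`, with the per-site scores then averaged. The pairwise AUROC above is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients.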