Frontiers in Medicine (Aug 2023)

Head and neck cancer treatment outcome prediction: a comparison between machine learning with conventional radiomics features and deep learning radiomics

  • Bao Ngoc Huynh,
  • Aurora Rosvoll Groendahl,
  • Oliver Tomic,
  • Kristian Hovde Liland,
  • Ingerid Skjei Knudtsen,
  • Ingerid Skjei Knudtsen,
  • Frank Hoebers,
  • Frank Hoebers,
  • Wouter van Elmpt,
  • Wouter van Elmpt,
  • Eirik Malinen,
  • Eirik Malinen,
  • Einar Dale,
  • Cecilia Marie Futsaether

DOI
https://doi.org/10.3389/fmed.2023.1217037
Journal volume & issue
Vol. 10

Abstract

Read online

BackgroundRadiomics can provide in-depth characterization of cancers for treatment outcome prediction. Conventional radiomics rely on extraction of image features within a pre-defined image region of interest (ROI) which are typically fed to a classification algorithm for prediction of a clinical endpoint. Deep learning radiomics allows for a simpler workflow where images can be used directly as input to a convolutional neural network (CNN) with or without a pre-defined ROI.PurposeThe purpose of this study was to evaluate (i) conventional radiomics and (ii) deep learning radiomics for predicting overall survival (OS) and disease-free survival (DFS) for patients with head and neck squamous cell carcinoma (HNSCC) using pre-treatment 18F-fluorodeoxuglucose positron emission tomography (FDG PET) and computed tomography (CT) images.Materials and methodsFDG PET/CT images and clinical data of patients with HNSCC treated with radio(chemo)therapy at Oslo University Hospital (OUS; n = 139) and Maastricht University Medical Center (MAASTRO; n = 99) were collected retrospectively. OUS data was used for model training and initial evaluation. MAASTRO data was used for external testing to assess cross-institutional generalizability. Models trained on clinical and/or conventional radiomics features, with or without feature selection, were compared to CNNs trained on PET/CT images without or with the gross tumor volume (GTV) included. Model performance was measured using accuracy, area under the receiver operating characteristic curve (AUC), Matthew’s correlation coefficient (MCC), and the F1 score calculated for both classes separately.ResultsCNNs trained directly on images achieved the highest performance on external data for both endpoints. Adding both clinical and radiomics features to these image-based models increased performance further. Conventional radiomics including clinical data could achieve competitive performance. However, feature selection on clinical and radiomics data lead to overfitting and poor cross-institutional generalizability. CNNs without tumor and node contours achieved close to on-par performance with CNNs including contours.ConclusionHigh performance and cross-institutional generalizability can be achieved by combining clinical data, radiomics features and medical images together with deep learning models. However, deep learning models trained on images without contours can achieve competitive performance and could see potential use as an initial screening tool for high-risk patients.

Keywords