Scientific Reports (Jun 2024)

Explainable prediction model for the human papillomavirus status in patients with oropharyngeal squamous cell carcinoma using CNN on CT images

  • Annarita Fanizzi,
  • Maria Colomba Comes,
  • Samantha Bove,
  • Elisa Cavalera,
  • Paola de Franco,
  • Alessia Di Rito,
  • Angelo Errico,
  • Marco Lioce,
  • Francesca Pati,
  • Maurizio Portaluri,
  • Concetta Saponaro,
  • Giovanni Scognamillo,
  • Ippolito Troiano,
  • Michele Troiano,
  • Francesco Alfredo Zito,
  • Raffaella Massafra

DOI
https://doi.org/10.1038/s41598-024-65240-9
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 8

Abstract


Several studies have emphasised that human papillomavirus-positive and -negative (HPV+ and HPV−, respectively) oropharyngeal squamous cell carcinoma (OPSCC) have distinct molecular profiles, tumor characteristics, and disease outcomes. Various radiomics-based prediction models have been proposed, some of them using innovative techniques such as Convolutional Neural Networks (CNNs). Although some of these models reached encouraging predictive performance, evidence explaining the role of radiomic features in achieving a specific outcome is scarce. In this paper, we present preliminary results for an explainable CNN-based model to predict HPV status in OPSCC patients. We extracted the Gross Tumor Volume (GTV) from pre-treatment CT images of 499 patients (356 HPV+ and 143 HPV−) included in the OPC-Radiomics public dataset to train an end-to-end Inception-V3 CNN architecture. We also collected a multicentric dataset of 92 patients (43 HPV+, 49 HPV−), which was used as an independent test set. Finally, we applied the Gradient-weighted Class Activation Mapping (Grad-CAM) technique to highlight the areas most informative for the predicted outcome. The proposed model reached an AUC value of 73.50% on the independent test set. According to the Grad-CAM algorithm, the most informative areas for the correctly classified HPV+ patients were located in the intratumoral area. Conversely, for the correctly classified HPV− patients, the most important areas corresponded to the tumor edges. Finally, because the proposed model complements classification accuracy with a visualization, for each case examined, of the areas of greatest interest for prediction, it could help increase confidence in using computer-based predictive models in clinical practice.
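As an illustration of the Grad-CAM step mentioned in the abstract, the following is a minimal NumPy sketch of the standard Grad-CAM combination rule, not the authors' implementation. It assumes the feature maps of one convolutional layer and the gradients of the class score with respect to them have already been obtained from a trained network; the function name `grad_cam` and the array shapes are hypothetical choices made for this example.

```python
import numpy as np


def grad_cam(feature_maps, gradients):
    """Combine feature maps and gradients into a Grad-CAM heatmap.

    feature_maps: array of shape (K, H, W), activations of one conv layer.
    gradients:    array of shape (K, H, W), gradients of the class score
                  with respect to those activations.
    Both inputs are assumed to be precomputed elsewhere.
    """
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps over the channel axis, shape (H, W).
    cam = np.tensordot(weights, feature_maps, axes=1)
    # ReLU keeps only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

In practice the resulting map would be upsampled to the size of the input CT slice and overlaid on it, so that high values mark the regions, such as the intratumoral area or the tumor edges, that contributed most to the predicted class.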

Keywords