Frontiers in Oncology (Jan 2022)

Multi-Institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume Using Planning CT and FDG-PET/CT

  • Xianghua Ye,
  • Dazhou Guo,
  • Chen-Kan Tseng,
  • Jia Ge,
  • Tsung-Min Hung,
  • Ping-Ching Pai,
  • Yanping Ren,
  • Lu Zheng,
  • Xinli Zhu,
  • Ling Peng,
  • Ying Chen,
  • Xiaohua Chen,
  • Chen-Yu Chou,
  • Danni Chen,
  • Jiaze Yu,
  • Yuzhen Chen,
  • Feiran Jiao,
  • Yi Xin,
  • Lingyun Huang,
  • Guotong Xie,
  • Jing Xiao,
  • Le Lu,
  • Senxiang Yan,
  • Dakai Jin,
  • Tsung-Ying Ho

DOI
https://doi.org/10.3389/fonc.2021.785788
Journal volume & issue
Vol. 11

Abstract

Background: The current clinical workflow for esophageal gross tumor volume (GTV) contouring relies on manual delineation, which incurs high labor costs and inter-user variability.

Purpose: To validate the clinical applicability of a deep learning multimodality esophageal GTV contouring model that was developed at one institution and tested at multiple institutions.

Materials and Methods: We retrospectively collected data from 606 patients with esophageal cancer at four institutions. Among them, 252 patients from institution 1 had both a treatment planning CT (pCT) scan and a paired diagnostic FDG-PET/CT scan; 354 patients from the three other institutions had only pCT scans, because of different staging protocols or the lack of PET scanners. A two-streamed deep learning model for GTV segmentation was developed using the pCT and PET/CT scans of a subset (148 patients) from institution 1. The resulting model had the flexibility to segment GTVs from pCT alone or from pCT and PET/CT combined when both were available. For independent evaluation, the remaining 104 patients from institution 1 served as an unseen internal test set, and the 354 patients from the other three institutions were used for external testing. Human experts further evaluated the degree of manual revision required, to assess the contour-editing effort. In addition, the deep model's performance was compared against that of four radiation oncologists in a multi-user study using 20 randomly chosen external patients. Contouring accuracy and time were recorded before and after deep learning-assisted delineation.

Keywords