Frontiers in Oncology (Mar 2022)

Multiscale Local Enhancement Deep Convolutional Networks for the Automated 3D Segmentation of Gross Tumor Volumes in Nasopharyngeal Carcinoma: A Multi-Institutional Dataset Study

  • Geng Yang,
  • Geng Yang,
  • Geng Yang,
  • Zhenhui Dai,
  • Yiwen Zhang,
  • Yiwen Zhang,
  • Lin Zhu,
  • Junwen Tan,
  • Zefeiyun Chen,
  • Zefeiyun Chen,
  • Bailin Zhang,
  • Chunya Cai,
  • Qiang He,
  • Fei Li,
  • Xuetao Wang,
  • Wei Yang,
  • Wei Yang

DOI
https://doi.org/10.3389/fonc.2022.827991
Journal volume & issue
Vol. 12

Abstract

Read online

PurposeAccurate segmentation of gross target volume (GTV) from computed tomography (CT) images is a prerequisite in radiotherapy for nasopharyngeal carcinoma (NPC). However, this task is very challenging due to the low contrast at the boundary of the tumor and the great variety of sizes and morphologies of tumors between different stages. Meanwhile, the data source also seriously affect the results of segmentation. In this paper, we propose a novel three-dimensional (3D) automatic segmentation algorithm that adopts cascaded multiscale local enhancement of convolutional neural networks (CNNs) and conduct experiments on multi-institutional datasets to address the above problems.Materials and MethodsIn this study, we retrospectively collected CT images of 257 NPC patients to test the performance of the proposed automatic segmentation model, and conducted experiments on two additional multi-institutional datasets. Our novel segmentation framework consists of three parts. First, the segmentation framework is based on a 3D Res-UNet backbone model that has excellent segmentation performance. Then, we adopt a multiscale dilated convolution block to enhance the receptive field and focus on the target area and boundary for segmentation improvement. Finally, a central localization cascade model for local enhancement is designed to concentrate on the GTV region for fine segmentation to improve the robustness. The Dice similarity coefficient (DSC), positive predictive value (PPV), sensitivity (SEN), average symmetric surface distance (ASSD) and 95% Hausdorff distance (HD95) are utilized as qualitative evaluation criteria to estimate the performance of our automated segmentation algorithm.ResultsThe experimental results show that compared with other state-of-the-art methods, our modified version 3D Res-UNet backbone has excellent performance and achieves the best results in terms of the quantitative metrics DSC, PPR, ASSD and HD95, which reached 74.49 ± 7.81%, 79.97 ± 13.90%, 1.49 ± 0.65 mm and 5.06 ± 3.30 mm, respectively. It should be noted that the receptive field enhancement mechanism and cascade architecture can have a great impact on the stable output of automatic segmentation results with high accuracy, which is critical for an algorithm. The final DSC, SEN, ASSD and HD95 values can be increased to 76.23 ± 6.45%, 79.14 ± 12.48%, 1.39 ± 5.44mm, 4.72 ± 3.04mm. In addition, the outcomes of multi-institution experiments demonstrate that our model is robust and generalizable and can achieve good performance through transfer learning.ConclusionsThe proposed algorithm could accurately segment NPC in CT images from multi-institutional datasets and thereby may improve and facilitate clinical applications.

Keywords