BMC Medical Imaging (Feb 2025)

Towards automatical tumor segmentation in radiomics: a comparative analysis of various methods and radiologists for both region extraction and downstream diagnosis

  • Ying Yu,
  • Gang-Feng Li,
  • Wei-Xiong Tan,
  • Xiao-Yan Qu,
  • Tao Zhang,
  • Xing-Yi Hou,
  • Yuan-Bo Zhu,
  • Zhi-Ying Ma,
  • Lu Yang,
  • Ya Gao,
  • Mei Yu,
  • Cui Yue,
  • Zhen Zhou,
  • Yang Yang,
  • Lin-Feng Yan,
  • Guang-Bin Cui

DOI
https://doi.org/10.1186/s12880-025-01596-2
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 13

Abstract

Objective: To determine whether a more stable method of tumor contour extraction can be obtained by comparing the differences, stability, and classification ability of tumor contours extracted by artificial intelligence and by radiologists.

Methods: We propose a novel framework for the automatic segmentation of lung tumor contours and for differential diagnosis in downstream tasks. The framework integrates four key modules: tumor segmentation, extraction of radiomic features, feature selection, and the development of diagnostic models for clinical applications. Using this framework, we conducted a study of a cohort of 1,429 patients with suspected lung cancer. Four automatic segmentation methods (RNN, UNET, WFCM, and SNAKE) were evaluated against manual segmentation performed by three radiologists with varying levels of expertise. We further studied the consistency of the radiomic features extracted by these methods and evaluated their diagnostic performance across three downstream tasks: benign vs. malignant classification, lung adenocarcinoma infiltration, and lung nodule density classification.

Results: RNN achieved the highest Dice coefficient among the four automatic segmentation methods (0.803 vs. 0.751, 0.576, and 0.560; all P < 0.05). In the consistency comparison of radiomic features extracted from the seven sets of contours (four automatic methods and three radiologists), the features extracted by RNN and by S1 (the senior radiologist) showed the highest similarity, exceeding that of the other automatic segmentation methods and of the less senior radiologists. In all three downstream tasks, the radiomic features extracted from RNN segmentation contours showed the highest diagnostic discrimination. In the classification of benign and malignant nodules, the RNN method performed slightly better than the S1 method (AUC of 0.840 ± 0.01 vs. 0.824 ± 0.015) and significantly better than the other five methods. Similarly, the RNN method achieved an AUC of 0.946 for lung adenocarcinoma infiltration and a kappa value of 0.729 for lung nodule density classification, both better than those of the other six methods.

Conclusions: Our findings suggest that AI-driven tumor segmentation methods can enhance clinical decision-making by providing reliable and reproducible results, underscoring the auxiliary role of automated tumor contouring in clinical practice. These findings have important implications for the application of radiomics in clinical practice.
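For readers unfamiliar with the Dice coefficient reported in the Results, the following is a minimal sketch of how the overlap between an automatic contour and a manual contour is commonly scored. It is not the authors' implementation: the function name `dice_coefficient`, the toy masks, and the convention for two empty masks are assumptions made purely for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as 2D lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    # Count pixels labeled as tumor in both masks.
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks (1 = tumor pixel, 0 = background).
auto_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
manual_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

score = dice_coefficient(auto_mask, manual_mask)
print(round(score, 3))  # 4 shared pixels, 5 + 5 labeled pixels -> 0.8
```

A value of 1 indicates identical contours and 0 indicates no overlap, so a higher Dice coefficient means the automatic contour agrees more closely with the reference contour.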

Keywords