陆军军医大学学报 (Jul 2025)

Development of a postoperative recurrence prediction model for stage Ⅰ non-small cell lung cancer patients using multimodal data based on machine learning

  • ZHANG Di,
  • WU Yi,
  • XU Yu

DOI
https://doi.org/10.16016/j.2097-0927.202410117
Journal volume & issue
Vol. 47, no. 14
pp. 1602 – 1611

Abstract

Objective: To develop a machine learning model that integrates preoperative chest CT radiomic features with clinical data to predict 5-year postoperative recurrence risk in patients with stage Ⅰ non-small cell lung cancer (NSCLC) undergoing surgical resection.

Methods: A total of 217 patients with pathologically confirmed stage Ⅰ NSCLC (selected from 778 initially screened cases according to the inclusion and exclusion criteria) treated at the Army Medical Center of PLA between January 2014 and December 2019 were retrospectively enrolled, including 53 recurrence cases and 164 non-recurrence cases within the 5-year follow-up. They were randomly divided into a training set (n=173) and a validation set (n=44) at a ratio of 8:2. Radiomic models were built from features extracted from tumor-dominant regions of interest (ROI) on CT images, while clinical models were developed from demographic characteristics and preoperative laboratory examinations. A combined model was then constructed by integrating both feature sets, and model performance was compared to identify the optimal predictive model.

Results: Features were screened from non-contrast CT images, and 7 radiomic features were ultimately selected to construct the radiomic model. Among 6 machine learning algorithms, the adaptive boosting (AdaBoost) model demonstrated the best overall predictive performance, with an area under the curve (AUC) of 0.866 (95% CI: 0.808–0.923; accuracy: 0.832, specificity: 0.884) in the training set and 0.806 (95% CI: 0.630–0.983; accuracy: 0.795, specificity: 0.971) in the validation set. Univariate and multivariate logistic regression analyses identified 4 clinical features for clinical model construction. The clinical model achieved an AUC of 0.874 (95% CI: 0.821–0.928; accuracy: 0.827, specificity: 0.891) in the training set and 0.813 (95% CI: 0.677–0.948; accuracy: 0.636, specificity: 0.600) in the validation set. By integrating the 7 radiomic features and 4 clinical features with a feature-level fusion strategy, the combined model exhibited further improved predictive performance, with an AUC of 0.953 (95% CI: 0.924–0.983; accuracy: 0.884, specificity: 0.860) in the training set and 0.852 (95% CI: 0.729–0.976; accuracy: 0.682, specificity: 0.629) in the validation set.

Conclusion: The combined model integrating preoperative CT radiomic features with clinical risk factors may provide an evidence-based framework for evaluating 5-year postoperative recurrence risk in stage Ⅰ NSCLC patients.
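The abstract does not describe its implementation, so the following is only a minimal standard-library Python sketch of three steps it names: feature-level fusion, the 8:2 split, and the reported metrics (AUC, accuracy, specificity). Everything in it is an assumption made for illustration. The feature values are synthetic placeholders, the function names are hypothetical, the split is stratified by class (the abstract says only "randomly divided"), and the 0.5 decision threshold is arbitrary. No classifier is fitted here.

```python
import random

def fuse_features(radiomic, clinical):
    """Feature-level fusion: concatenate each patient's two feature vectors."""
    return [r + c for r, c in zip(radiomic, clinical)]

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Hold out val_fraction of each class (assumed stratification)."""
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_val = round(len(idx) * val_fraction)
        val_idx += idx[:n_val]
        train_idx += idx[n_val:]
    return sorted(train_idx), sorted(val_idx)

def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_specificity(labels, scores, threshold=0.5):
    """Accuracy and specificity at an assumed decision threshold."""
    preds = [int(s >= threshold) for s in scores]
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return correct / len(labels), tn / labels.count(0)

# Synthetic cohort with the class counts from the abstract:
# 53 recurrence (label 1) and 164 non-recurrence (label 0) cases.
rng = random.Random(42)
labels = [1] * 53 + [0] * 164
radiomic = [[rng.gauss(0, 1) for _ in range(7)] for _ in labels]  # 7 placeholder features
clinical = [[rng.gauss(0, 1) for _ in range(4)] for _ in labels]  # 4 placeholder features

fused = fuse_features(radiomic, clinical)
train_idx, val_idx = stratified_split(labels)
print(len(fused[0]), len(train_idx), len(val_idx))  # prints: 11 173 44
```

With these class counts, a stratified 20% hold-out gives the same set sizes as the abstract (173 and 44). The class mix inside each set depends on how the random split was actually done, which the abstract does not say. In practice the scores passed to `auc` would come from a fitted classifier such as AdaBoost, trained on the fused training rows.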

Keywords