Cancer Medicine (Oct 2024)
Differential Radiomics‐Based Signature Predicts Lung Cancer Risk Accounting for Imaging Parameters in NLST Cohort
Abstract
Objective Lung cancer remains the leading cause of cancer‐related mortality worldwide, with most cases diagnosed at advanced stages. Hence, there is a need to develop effective predictive models for early detection. This study aims to investigate the impact of imaging parameters and delta radiomic features from temporal scans on lung cancer risk prediction.
Methods Using data from the National Lung Screening Trial (NLST) in a nested case–control study involving 462 positive screenings, radiomic features were extracted from temporal computed tomography (CT) scans and harmonized with the ComBat method to adjust for variations in slice thickness category (TC) and reconstruction kernel type (KT). Three feature sets, each in harmonized and non‐harmonized form, were analyzed: baseline features (T0), delta features between T0 and the scan one year later (T1), and combined T0 and delta features. Feature reduction was done using LASSO, followed by five feature selection (FS) methods and nine machine learning (ML) models, evaluated with 5‐fold cross‐validation repeated 10 times. The Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance.
Results Models using delta features outperformed those using baseline features, and SMOTE consistently boosted performance when a combination of baseline and delta features was used. TC‐based harmonized features improved performance with SMOTE, but overall, harmonization did not significantly enhance model performance. The highest test score of 0.76 was achieved in three scenarios, all using SMOTE: delta features with a Gradient Boosting (GB) model (TC‐based harmonization and MultiSurf FS); T0 + delta features with a Support Vector Classifier (SVC) model (KT‐based harmonization and F‐test FS); and T0 + delta features with an XGBoost (XGB) model (TC‐based harmonization and Mutual Information (MI) FS).
Conclusions This study underscores the significance of delta radiomic features and balanced datasets to improve lung cancer prediction. While our findings are based on a subsample of NLST data, they provide a valuable foundation for further exploration. Further research is needed to assess the impact of harmonization on imaging‐derived models. Future investigations should explore advanced harmonization techniques and additional imaging parameters to develop robust radiomics‐based biomarkers of lung cancer risk.
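The feature construction and evaluation scheme described in the Methods can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. It assumes that a delta feature is the plain difference T1 - T0 (the abstract does not give the formula), uses a simplified SMOTE-style interpolation, and omits ComBat harmonization, LASSO, feature selection, and the classifiers. All function names and feature values are hypothetical.

```python
import random


def delta_features(t0, t1):
    """Per-feature change from baseline (T0) to follow-up (T1).

    Assumption: delta = T1 - T0 (a relative change is another common choice).
    """
    return {name: t1[name] - t0[name] for name in t0}


def combine(t0, delta):
    """Join baseline and delta features into one 'T0 + delta' feature set."""
    combined = {f"T0_{name}": value for name, value in t0.items()}
    combined.update({f"delta_{name}": value for name, value in delta.items()})
    return combined


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs, preserving class ratios per fold."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for pos, i in enumerate(idx):
                folds[pos % n_splits].append(i)  # deal indices round-robin
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, test


def smote_like(minority_rows, n_new, k=5, seed=0):
    """Simplified SMOTE: interpolate between a minority row and one of its
    k nearest minority neighbours. Needs at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority_rows))
        base = minority_rows[base_idx]
        others = [r for i, r in enumerate(minority_rows) if i != base_idx]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy example with made-up feature values (not NLST data).
t0 = {"volume": 100.0, "entropy": 2.0}
t1 = {"volume": 130.0, "entropy": 2.5}
features = combine(t0, delta_features(t0, t1))

labels = [1] * 20 + [0] * 80  # imbalanced toy labels
splits = list(repeated_stratified_kfold(labels))  # 5 folds x 10 repeats
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
extra = smote_like(minority, n_new=6, k=3)
```

In a full pipeline, oversampling would typically be applied only to the training indices of each split so that synthetic samples never reach the test fold. The abstract does not state where SMOTE was applied, so that placement is an assumption of this sketch.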
Keywords