Journal of Inflammation Research (Nov 2024)
Integrating Machine Learning and the SHapley Additive exPlanations (SHAP) Framework to Predict Lymph Node Metastasis in Gastric Cancer Patients Based on Inflammation Indices and Peripheral Lymphocyte Subpopulations
Abstract
Ziyu Zhu,1,* Cong Wang,1,* Lei Shi,2 Mengya Li,3 Jiaqi Li,3 Shiyin Liang,3 Zhidong Yin,1 Yingwei Xue1
1Department of Gastroenterological Surgery, Harbin Medical University Cancer Hospital, Harbin, People’s Republic of China; 2Department of Oncology, Beidahuang Industry Group General Hospital, Harbin, People’s Republic of China; 3Key Laboratory of Preservation of Genetic Resources and Disease Control in China, Harbin Medical University, Harbin, People’s Republic of China
*These authors contributed equally to this work
Correspondence: Yingwei Xue; Zhidong Yin, Email [email protected]; [email protected]
Accurately predicting lymph node metastasis in gastric cancer, a pivotal determinant of treatment approach and prognosis, remains a significant challenge.
Methods: In this study, we combined machine learning methods with the SHapley Additive exPlanations (SHAP) framework to develop an integrated predictive model. The model uses inflammation indices, which can be obtained preoperatively, to improve the accuracy of predicting lymph node metastasis in gastric cancer patients.
Results: Lymph node metastasis was an independent prognostic risk factor for gastric cancer patients. Among the models compared, XGBoost performed best. In the training set, it achieved the highest AUC (0.705); in the test set, it achieved the highest AUC (0.695) and the lowest Brier score (0.218). In terms of feature importance, the platelet-to-lymphocyte ratio (PLR) was the most significant factor influencing lymph node metastasis in gastric cancer patients.
Screening of differentially expressed genes identified six genes with prognostic value for survival: IGFN1, CLEC11A, STC2, TFEC, MUC5AC, and ANOS1.
Conclusion: The XGBoost model can predict lymph node metastasis (LNM) in gastric cancer patients based on inflammation indices and peripheral lymphocyte subpopulations. Combined with SHAP, it shows more intuitively how each variable affects LNM. Among the inflammation indices, PLR was the most important risk factor for LNM in gastric cancer patients.
Keywords: Machine Learning, SHAP Framework, Lymph Node Metastasis, Gastric Cancer, Inflammation Indices, Peripheral Lymphocyte Subpopulations
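The two evaluation metrics reported in the Results, AUC and Brier score, can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `brier_score` and the toy labels and probabilities are hypothetical, chosen only to show how each metric is defined for a binary outcome such as LNM present (1) or absent (0).

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = metastasis present, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0]
# Hypothetical predicted probabilities of metastasis from some classifier.
y_prob = [0.8, 0.3, 0.6, 0.4, 0.5, 0.2]

print("AUC:", round(roc_auc(y_true, y_prob), 3))
print("Brier score:", round(brier_score(y_true, y_prob), 3))
```

For orientation, an AUC of 0.5 corresponds to chance-level ranking, and a model that always predicts a probability of 0.5 has a Brier score of 0.25; lower Brier scores indicate better agreement between predicted probabilities and outcomes.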