Cancer Management and Research (May 2024)

Machine Learning for Prediction of Non-Small Cell Lung Cancer Based on Inflammatory and Nutritional Indicators in Adults: A Cross-Sectional Study

  • Wang Q,
  • Liang T,
  • Li Y,
  • Liu X

Journal volume & issue
Vol. 16
pp. 527–535

Abstract

Qiaoli Wang,1,* Tao Liang,2,* Yuexi Li,1,* Xiaoqin Liu1

1 Department of Health Screening Center, Deyang Peoples’ Hospital, Deyang, Sichuan, 618000, People’s Republic of China
2 Department of Gastroenterology, Deyang Peoples’ Hospital, Deyang, Sichuan, 618000, People’s Republic of China
* These authors contributed equally to this work

Correspondence: Xiaoqin Liu, Email [email protected]

The aim of this study was to evaluate the potential value of blood inflammatory indicators in the diagnosis of non-small cell lung cancer (NSCLC) and to propose a machine-learning-based method for predicting NSCLC in asymptomatic adults.

Patients and Methods: A cross-sectional study was conducted using the medical records of 139 patients with NSCLC and physical examination data of 198 healthy controls from May 2022 to May 2023. The NSCLC cohort comprised 128 cases of adenocarcinoma, 3 cases of squamous cell carcinoma, and 8 cases of other NSCLC subtypes. The correlation between NSCLC and inflammatory and nutritional markers, such as monocytes, neutrophils, LMR, NLR, PLR, and PHR, was examined. Features were selected using a Python feature selection library and analyzed with five algorithms. The predictive ability of each model for NSCLC diagnosis was assessed by precision, accuracy, recall, F1 score, and area under the curve (AUC).

Results: The top 14 important factors were PDW, age, TP, RBC, HGB, LYM, LYM%, RDW, PLR, LMR, PHR, MONO, MONO%, and gender. Among the five machine learning algorithms, the naive Bayes (NB) algorithm demonstrated the highest overall performance in predicting adult NSCLC, achieving an accuracy of 0.87, a macro average F1 score of 0.85, a weighted average F1 score of 0.87, and an AUC of 0.84.

Conclusion: In the feature ranking, platelet distribution width (PDW) was the most important feature, and the NB algorithm performed best in predicting adult NSCLC diagnosis.

Keywords: machine learning, non-small cell lung cancer, inflammatory indicators, nutritional indicators, ratio, diagnosis
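The abstract reports accuracy together with a macro average and a weighted average F1 score. The following is a minimal sketch of how these three summaries are commonly computed for a two-class problem; it is not the authors' code. The function names (`per_class_f1`, `summarize`) and the ten labels are hypothetical and made up for illustration, and do not come from the study data.

```python
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def summarize(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, macro average F1, and weighted average F1."""
    labels = sorted(set(y_true))
    n = len(y_true)
    f1 = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    # Support = number of true samples in each class.
    support = {c: sum(1 for t in y_true if t == c) for c in labels}
    return {
        "accuracy": sum(1 for t, p in zip(y_true, y_pred) if t == p) / n,
        # Macro average: every class counts equally.
        "macro_f1": sum(f1.values()) / len(labels),
        # Weighted average: each class counts in proportion to its support.
        "weighted_f1": sum(f1[c] * support[c] / n for c in labels),
    }


# Hypothetical labels for illustration only (1 = NSCLC, 0 = healthy control).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = summarize(y_true, y_pred)
print(metrics)
```

When the classes are unequal in size, as with 139 cases and 198 controls, the weighted average leans toward the F1 score of the larger class, which is one way a weighted average F1 can exceed the macro average.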

Keywords