Machine learning predicts cancer-associated venous thromboembolism using clinically available variables in gastric cancer patients
Qianjie Xu,
Haike Lei,
Xiaosheng Li,
Fang Li,
Hao Shi,
Guixue Wang,
Anlong Sun,
Ying Wang,
Bin Peng
Affiliations
Qianjie Xu
Department of Health Statistics, School of Public Health, Chongqing Medical University, Chongqing, 400016, China
Haike Lei
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China
Xiaosheng Li
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China
Fang Li
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China
Hao Shi
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China
Guixue Wang
MOE Key Lab for Biorheological Science and Technology, State and Local Joint Engineering Laboratory for Vascular Implants, College of Bioengineering, Chongqing University, Chongqing, 400030, China
Anlong Sun
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China; Corresponding author.
Ying Wang
Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing, 400030, China; Corresponding author.
Bin Peng
Department of Health Statistics, School of Public Health, Chongqing Medical University, Chongqing, 400016, China; Corresponding author.
Gastric cancer (GC) has one of the highest rates of thrombosis among cancers, which can lead to considerable morbidity, mortality, and additional costs. However, to date, no suitable venous thromboembolism (VTE) risk prediction model exists for gastric cancer patients, so there is an urgent need to establish one. We collected data on 3092 patients between January 1, 2018 and December 31, 2021. After feature selection, 11 variables were retained as predictors. Five machine learning (ML) algorithms were used to build VTE prediction models, and their accuracy, sensitivity, specificity, and area under the ROC curve (AUC) were compared with those of traditional logistic regression (LR) to recommend the best VTE prediction model. The random forest (RF) and XGBoost (XGB) models identified the most important features: clinical stage, blood transfusion history, D-dimer, age, and FDP. In the validation set, the model achieved an AUC of 0.825, an accuracy of 0.799, a sensitivity of 0.710, and a specificity of 0.802. The model performs well, has high application value in clinical practice, and can help identify gastric cancer patients at high risk so that venous thromboembolism can be prevented.
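For readers less familiar with the four evaluation metrics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are commonly computed for a binary VTE classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made purely for illustration and are not study data.

```python
# Illustrative sketch only: names, threshold, and toy data are hypothetical.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = VTE, 0 = no VTE."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, sensitivity, specificity, and AUC from predicted risks."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": auc(y_true, y_score),
    }


# Made-up toy data (not from the study): 3 VTE cases and 5 non-cases.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

metrics = classification_metrics(toy_labels, toy_scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy, sensitivity, and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the metrics are typically reported together when comparing models.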