Thrombosis Journal (Aug 2024)

Identification of key risk factors for venous thromboembolism in urological inpatients based on the Caprini scale and interpretable machine learning methods

  • Chao Liu,
  • Wei-Ying Yang,
  • Fengmin Cheng,
  • Ching-Wen Chien,
  • Yen-Ching Chuang,
  • Yanjun Jin

DOI
https://doi.org/10.1186/s12959-024-00645-0
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 15

Abstract

Purpose: To identify the key risk factors for venous thromboembolism (VTE) in urological inpatients based on the Caprini scale using an interpretable machine learning method.

Methods: VTE risk data of urological inpatients were obtained based on the Caprini scale in the case hospital. The Boruta method was then used to select the key variables from the 37 variables in the Caprini scale. Decision rules corresponding to each risk level were generated using the rough set (RS) method. Finally, random forest (RF), support vector machine (SVM), and backpropagation artificial neural network (BPANN) models were used to verify classification accuracy on the data and were compared with the RS method.

Results: Following the screening, the key risk factors for VTE in urology were “(C 1) Age,” “(C 2) Minor Surgery planned,” “(C 3) Obesity (BMI > 25),” “(C 8) Varicose veins,” “(C 9) Sepsis ([…] 45 min),” “(C 19) Laparoscopic surgery (> 45 min),” “(C 20) Patient confined to bed (> 72 h),” “(C 18) Malignancy (present or previous),” “(C 23) Central venous access,” “(C 31) History of DVT/PE,” “(C 32) Other congenital or acquired thrombophilia,” and “(C 34) Stroke ([…] 45 minutes),” and “(C 21) Malignancy (present or previous)” were the main factors influencing mid- and high-risk levels, and some suggestions on VTE prevention were indicated based on these three factors. The average accuracies of the RS, RF, SVM, and BPANN models were 79.5%, 87.9%, 92.6%, and 97.2%, respectively. BPANN also had the highest accuracy, recall, F1-score, and precision.

Conclusions: The RS model achieved poorer accuracy than the other three common machine learning models. However, the RS model provides strong interpretability and allows for the identification of high-risk factors and decision rules influencing high-risk assessments of VTE in urology. This transparency is very important for clinicians in the risk assessment process.
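The abstract compares models by accuracy, recall, F1-score, and precision across risk levels. As a minimal sketch of how such figures can be computed for a multi-class risk label, the Python below derives them from true and predicted labels. The macro averaging, the function name `risk_level_metrics`, the three label names, and the toy data are all assumptions for illustration; none of them come from the study.

```python
from collections import Counter


def risk_level_metrics(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here, not stated in the abstract)
    gives every risk level equal weight regardless of its frequency.
    """
    # Count each (true label, predicted label) pair; missing pairs count as 0.
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        predicted = sum(pairs[(t, c)] for t in labels)  # all cases predicted as c
        actual = sum(pairs[(c, p)] for p in labels)     # all cases truly c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, not data from the study.
labels = ["low", "mid", "high"]
y_true = ["low", "low", "low", "mid", "mid", "mid", "high", "high", "high", "high"]
y_pred = ["low", "low", "mid", "mid", "mid", "high", "high", "high", "high", "mid"]

metrics = risk_level_metrics(y_true, y_pred, labels)
print(metrics)
```

Running the same function on each model's predictions for a shared test set would give directly comparable figures, which is the kind of comparison the abstract reports for RS, RF, SVM, and BPANN.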

Keywords