Chinese Medical Journal (Aug 2024)

Machine-learning-based models assist the prediction of pulmonary embolism in autoimmune diseases: A retrospective, multicenter study

  • Ziwei Hu
  • Yangyang Hu
  • Shuoqi Zhang
  • Li Dong
  • Xiaoqi Chen
  • Huiqin Yang
  • Linchong Su
  • Xiaoqiang Hou
  • Xia Huang
  • Xiaolan Shen
  • Cong Ye
  • Wei Tu
  • Yu Chen
  • Yuxue Chen
  • Shaozhe Cai
  • Jixin Zhong
  • Lingli Dong
  • Lishao Guo

DOI
https://doi.org/10.1097/CM9.0000000000003025
Journal volume & issue
Vol. 137, no. 15
pp. 1811 – 1822

Abstract

Background: Pulmonary embolism (PE) is a severe and acute cardiovascular syndrome with high mortality among patients with autoimmune inflammatory rheumatic diseases (AIIRDs). Accurate prediction and timely intervention play a pivotal role in enhancing survival rates. However, practical early prediction and risk assessment systems for PE in patients with AIIRD are notably scarce.

Methods: In the training cohort, 60 cases of AIIRD with PE and 180 age-, gender-, and disease-matched cases of AIIRD without PE were identified from 7254 AIIRD cases in Tongji Hospital from 2014 to 2022. Univariable logistic regression (LR) and least absolute shrinkage and selection operator (LASSO) were used to select the clinical features for further training with machine learning (ML) methods, including random forest (RF), support vector machines (SVM), neural network (NN), LR, gradient boosted decision tree (GBDT), classification and regression trees (CART), and C5.0 models. The performance of these models was subsequently validated using a multicenter validation cohort.

Results: In the training cohort, 24 and 13 clinical features were selected by the univariable LR and LASSO strategies, respectively. Five ML models (RF, SVM, NN, LR, and GBDT) showed promising performance, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.962–1.000 in the training cohort and 0.969–0.999 in the validation cohort. The CART and C5.0 models achieved AUCs of 0.850 and 0.932, respectively, in the training cohort. Using D-dimer as a pre-screening index, the refined C5.0 model achieved an AUC exceeding 0.948 in the training cohort and an AUC above 0.925 in the validation cohort. These results markedly outperformed the use of D-dimer levels alone.

Conclusion: ML-based models proved precise in predicting the onset of PE in patients with AIIRD who exhibit clinical suspicion of PE.

Trial Registration: Chictr.org.cn: ChiCTR2200059599.
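
The AUC values reported above can be read as the probability that a randomly chosen PE case receives a higher model score than a randomly chosen non-PE case. The following is a minimal sketch of that pairwise definition in plain Python. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data, NOT from the study: 1 = PE, 0 = non-PE.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 14 of 15 pairs are ranked correctly
```

An AUC of 0.5 corresponds to a score that ranks cases no better than chance, and 1.0 to a score that separates the two groups perfectly. The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better for larger data.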