Journal of Hepatocellular Carcinoma (Jul 2022)

A Machine Learning Model Based on Health Records for Predicting Recurrence After Microwave Ablation of Hepatocellular Carcinoma

  • An C,
  • Yang H,
  • Yu X,
  • Han Z,
  • Cheng Z,
  • Liu F,
  • Dou J,
  • Li B,
  • Li Y,
  • Li Y,
  • Yu J,
  • Liang P

Journal volume & issue
Vol. 9
pp. 671 – 684

Abstract

Chao An,1,* Hongcai Yang,1,2,* Xiaoling Yu,1 Zhi-Yu Han,1 Zhigang Cheng,1 Fangyi Liu,1 Jianping Dou,1 Bing Li,3 Yansheng Li,4 Yichao Li,4 Jie Yu,1 Ping Liang1

1Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China; 2School of Medicine, Nankai University, Tianjin, People’s Republic of China; 3National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, People’s Republic of China; 4DHC Mediway Technology CO, Ltd, Beijing, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Ping Liang; Jie Yu, Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China, Tel +86-10-66939530, Fax +86-10-68161218, Email [email protected]; [email protected]

Background and Aim: Early recurrence (ER) presents a challenge for the survival prognosis of patients with hepatocellular carcinoma (HCC). The aim of this study was to investigate machine learning (ML) models using clinical data for predicting ER after microwave ablation (MWA).

Methods: A total of 1574 patients with early-stage HCC who underwent MWA at four hospitals between August 2005 and December 2019 were reviewed. For each patient, 36 clinical data points were collected, and the patients were assigned to the training, internal validation, and external validation sets. In addition to traditional logistic regression (LR), three ML models (random forest, support vector machine, and eXtreme Gradient Boosting [XGBoost]) were built, and their predictive ability was assessed using the area under the receiver operating characteristic curve (AUC). The SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) algorithms were used to interpret the models.

Results: All three ML models outperformed LR in predictive ability (P < 0.001 for all). With nine variables (tumor number, platelet, α-fetoprotein, comorbidity score, white blood cell, cholinesterase, prothrombin time, neutrophils, and etiology) selected by recursive feature elimination with cross-validation, the XGBoost model achieved the best discrimination among all models, with an AUC of 0.75 (95% confidence interval [CI]: 0.72–0.78) in the training set, 0.74 (95% CI: 0.69–0.80) in the internal validation set, and 0.76 (95% CI: 0.70–0.82) in the external validation set. The model was interpreted through visualization of risk factors with the SHAP and LIME algorithms. A prediction system for post-ablation recurrence risk stratification, based on the XGBoost analysis, was made available online (http://114.251.235.51:8001/).

Conclusion: The XGBoost model based on clinical data can effectively predict ER risk after MWA, which can contribute to surveillance, prevention, and treatment strategies for HCC.

Keywords: microwave ablation, hepatocellular carcinoma, recurrence, machine learning, risk stratification
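The Results report discrimination as an AUC with a 95% CI. The abstract does not say how those intervals were computed, so the following is only a minimal sketch of one common approach, a percentile bootstrap around a rank-based AUC. The function names (`auc`, `bootstrap_ci`), the number of resamples, and the synthetic labels and scores are illustrative assumptions and are not taken from the study.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic stand-in data (NOT study data): 1 = early recurrence, 0 = none.
_rng = random.Random(42)
labels = [1 if _rng.random() < 0.3 else 0 for _ in range(200)]
scores = [_rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUC {point:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

In practice the scores would be the predicted recurrence probabilities from a fitted model on a held-out validation set. Other interval methods (for example, DeLong's) are also widely used and may give slightly different bounds.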

Keywords