BMC Medical Informatics and Decision Making (Nov 2023)

A hybrid stacked ensemble and Kernel SHAP-based model for intelligent cardiotocography classification and interpretability

  • Junyuan Feng,
  • Jincheng Liang,
  • Zihan Qiang,
  • Yuexing Hao,
  • Xia Li,
  • Li Li,
  • Qinqun Chen,
  • Guiqing Liu,
  • Hang Wei

DOI
https://doi.org/10.1186/s12911-023-02378-y
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 12

Abstract

Background: Intelligent cardiotocography (CTG) classification can assist obstetricians in evaluating fetal health. However, high classification performance is often achieved with complex machine learning (ML)-based models, which raises interpretability concerns. This trade-off between accuracy and interpretability makes it difficult for most existing ML-based CTG classification models to gain acceptance in prenatal clinical practice.

Methods: To improve both CTG classification performance and prediction interpretability, a hybrid model was proposed that combines a stacked ensemble strategy using mixed features with the Kernel SHapley Additive exPlanations (SHAP) framework. First, the stacked ensemble classifier was built with support vector machines (SVM), extreme gradient boosting (XGB), and random forests (RF) as base learners, and a backpropagation (BP) neural network as the meta-learner, whose input combined the original CTG features with the per-class probabilities output by the base learners. Then, a public and a private CTG dataset were used to verify the discriminative performance. Finally, Kernel SHAP was applied to estimate the contribution values of the features and their relationships to the fetal states.

Results: For intelligent CTG classification using 10-fold cross-validation, the accuracy and average F1 score were 0.9539 and 0.9249, respectively, on the public dataset, and 0.9201 and 0.8926, respectively, on the private dataset. For interpretability, the explanation results indicated that accelerations (AC) and the percentage of time with abnormal short-term variability (ASTV) were the key determinants. Specifically, as the value of ASTV grew, the probability of the abnormal state increased and that of the normal state decreased. In addition, the likelihood of the normal state rose as AC increased.

Conclusions: The proposed model achieves high classification performance and reasonable interpretability for intelligent fetal monitoring.
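The mixed-feature stacking described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the data are synthetic (21 features and 3 classes are arbitrary stand-ins for CTG records), scikit-learn's `GradientBoostingClassifier` is swapped in for XGB so the example depends on a single library, `MLPClassifier` plays the role of the BP meta-learner, and all hyperparameters are arbitrary. The option `passthrough=True` is what feeds the meta-learner the original features alongside the base learners' class probabilities.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for CTG data (assumption: 21 features, 3 fetal states).
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners: SVM, gradient boosting (stand-in for XGB), random forest.
base_learners = [
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Meta-learner: a small neural network trained by backpropagation.
meta_learner = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)

# passthrough=True appends the original features to the base learners'
# out-of-fold class probabilities, giving the "mixed" meta-learner input.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    stack_method="predict_proba",
    passthrough=True,
    cv=5,
)
stack.fit(X_train, y_train)

meta_inputs = stack.transform(X_test)  # 3 learners x 3 classes + 21 features
proba = stack.predict_proba(X_test)
accuracy = stack.score(X_test, y_test)
print(meta_inputs.shape, round(accuracy, 3))
```

In this sketch each row of `meta_inputs` holds nine probability columns (three base learners times three classes) followed by the 21 original features. A model-agnostic explainer such as Kernel SHAP could then be applied to `stack.predict_proba`, since it only needs a prediction function and background data.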

Keywords