Scientific Reports (Jul 2024)
Explainable artificial intelligence (XAI) for predicting the need for intubation in methanol-poisoned patients: a study comparing deep and machine learning models
Abstract
Failure to predict the need for intubation in methanol-poisoned patients in time can lead to irreparable complications and even death. Artificial intelligence (AI) techniques such as machine learning (ML) and deep learning (DL) can help predict this need accurately. This study therefore assesses explainable artificial intelligence (XAI) for predicting intubation necessity in methanol-poisoned patients, comparing DL and ML models. The study analyzed a dataset of 897 patient records from Loghman Hakim Hospital in Tehran, Iran, covering methanol-poisoning cases that required intubation (202 cases) and cases that did not (695 cases). Eight established models were used: four ML (SVM, XGB, DT, RF) and four DL (DNN, FNN, LSTM, CNN). Tenfold cross-validation and hyperparameter tuning were applied to prevent overfitting. The study also addressed interpretability through the SHAP and LIME methods. Model performance was evaluated based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior performance in accuracy (94.0%), sensitivity (99.0%), specificity (94.0%), and F1-score (97.0%). CNN led in ROC with 78.0%. Among ML models, RF excelled in accuracy (97.0%) and specificity (100%), followed by XGB with sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%). Overall, RF and XGB outperformed the other models. ML models surpassed DL models across all metrics, with accuracies from 93.0% to 97.0% for DL and 93.0% to 99.0% for ML. Sensitivities ranged from 98.0% to 99.37% for DL and 93.0% to 99.0% for ML. DL models achieved specificities from 78.0% to 94.0%, while ML models ranged from 93.0% to 100%. F1-scores for DL were between 93.0% and 97.0%, and for ML between 96.0% and 98.27%. DL models scored ROC between 68.0% and 78.0%, while ML models ranged from 84.0% to 96.08%. Key features for predicting intubation necessity include GCS at admission, ICU admission, age, longer folic acid therapy duration, elevated BUN and AST levels, VBG_HCO3 at initial record, and hemodialysis. This study showcases XAI's effectiveness in predicting intubation necessity in methanol-poisoned patients. ML models, particularly RF and XGB, outperform their DL counterparts, underscoring their potential for clinical decision-making.
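As a reading aid for the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and F1-score are conventionally derived from binary predictions. It is not the authors' code: the function name `binary_metrics`, the toy labels, and the choice of "intubated" as the positive class (label 1) are assumptions made for illustration only.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and F1 for binary labels.

    Assumption: label 1 means "intubation required" (positive class),
    label 0 means "no intubation required".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall on the positive class
        "specificity": ratio(tn, tn + fp),   # recall on the negative class
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical toy data, not taken from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

With imbalanced classes such as the 202 versus 695 split described above, accuracy alone can look high even when one class is predicted poorly, which is one reason sensitivity and specificity are reported separately.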
Keywords