Reviews in Cardiovascular Medicine (Sep 2023)
Clinical Validation of Explainable Deep Learning Model for Predicting the Mortality of In-Hospital Cardiac Arrest Using Diagnosis Codes of Electronic Health Records
Abstract
Background: Using deep learning for disease outcome prediction is an approach that has made large advances in recent years. Notwithstanding its excellent performance, clinicians are also interested in learning how input affects prediction. Clinical validation of explainable deep learning models is also as yet unexplored. This study aims to evaluate the performance of Deep SHapley Additive exPlanations (D-SHAP) model in accurately identifying the diagnosis code associated with the highest mortality risk. Methods: Incidences of at least one in-hospital cardiac arrest (IHCA) for 168,693 patients as well as 1,569,478 clinical records were extracted from Taiwan’s National Health Insurance Research Database. We propose a D-SHAP model to provide insights into deep learning model predictions. We trained a deep learning model to predict the 30-day mortality likelihoods of IHCA patients and used D-SHAP to see how the diagnosis codes affected the model’s predictions. Physicians were asked to annotate a cardiac arrest dataset and provide expert opinions, which we used to validate our proposed method. A 1-to-4-point annotation of each record (current decision) along with four previous records (historical decision) was used to validate the current and historical D-SHAP values. Results: A subset consisting of 402 patients with at least one cardiac arrest record was randomly selected from the IHCA cohort. The median age was 72 years, with mean and standard deviation of 69 ± 17 years. Results indicated that D-SHAP can identify the cause of mortality based on the diagnosis codes. The top five most important diagnosis codes, namely respiratory failure, sepsis, pneumonia, shock, and acute kidney injury were consistent with the physician’s opinion. Some diagnoses, such as urinary tract infection, showed a discrepancy between D-SHAP and clinical judgment due to the lower frequency of the disease and its occurrence in combination with other comorbidities. Conclusions: The D-SHAP framework was found to be an effective tool to explain deep neural networks and identify most of the important diagnoses for predicting patients’ 30-day mortality. However, physicians should always carefully consider the structure of the original database and underlying pathophysiology.
Keywords