BMC Medicine (Sep 2023)
The influence of explainable vs non-explainable clinical decision support systems on rapid triage decisions: a mixed methods study
Abstract
Background
During the COVID-19 pandemic, a variety of clinical decision support systems (CDSS) were developed to aid patient triage. However, research focusing on the interaction between decision support systems and human experts is lacking.
Methods
Thirty-two physicians were recruited to rate the survival probability of 59 critically ill patients by means of chart review. Subsequently, one of two artificial intelligence systems advised the physician of a computed survival probability. However, only one of these systems explained the reasons behind its decision-making. In the third step, physicians reviewed the chart once again to determine the final survival probability rating. We hypothesized that an explaining system would exhibit a higher impact on the physicians’ second rating (i.e., higher weight-on-advice).
Results
The survival probability rating given by the physician after receiving advice from the clinical decision support system was a median of 4 percentage points closer to the advice than the initial rating. Weight-on-advice was not significantly different (p = 0.115) between the two systems (with vs without explanation for its decision). Additionally, weight-on-advice showed no difference according to time of day or between board-qualified and not yet board-qualified physicians. Self-reported post-experiment overall trust was awarded a median of 4 out of 10 points. When asked after the conclusion of the experiment, overall trust was 5.5/10 (non-explaining median 4 (IQR 3.5–5.5), explaining median 7 (IQR 5.5–7.5), p = 0.007).
Conclusions
Although overall trust in the models was low, the median (IQR) weight-on-advice was high (0.33 (0.0–0.56)) and in line with published literature on expert advice. In contrast to the hypothesis, weight-on-advice was comparable between the explaining and non-explaining systems. In 30% of cases, weight-on-advice was 0, meaning the physician did not change their rating. The median of the remaining weight-on-advice values was 50%, suggesting that physicians either dismissed the recommendation or employed a “meeting halfway” approach. Newer technologies, such as clinical reasoning systems, may be able to augment the decision process rather than simply presenting unexplained bias.
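The abstract does not state how weight-on-advice is computed. A minimal sketch, assuming the definition commonly used in the judge-advisor literature (the shift from the initial to the final rating, divided by the distance from the initial rating to the advice), may help in reading the values of 0 and 50% above. The function name and the example ratings are hypothetical and are not taken from the study.

```python
from typing import Optional


def weight_on_advice(initial: float, advice: float, final: float) -> Optional[float]:
    """Assumed definition: WOA = (final - initial) / (advice - initial).

    Returns None when the advice equals the initial rating, because the
    ratio is undefined in that case.
    """
    if advice == initial:
        return None
    return (final - initial) / (advice - initial)


# Hypothetical ratings in percentage points of survival probability.
print(weight_on_advice(initial=40, advice=60, final=40))  # 0.0: rating unchanged
print(weight_on_advice(initial=40, advice=60, final=50))  # 0.5: "meeting halfway"
```

Under this assumed definition, a value of 1 would correspond to adopting the advice in full, and values outside the range 0 to 1 are possible when the final rating moves away from the advice or beyond it.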
Keywords