IEEE Access (Jan 2023)
Recursive Visual Explanations Mediation Scheme Based on DropAttention Model With Multiple Episodes Pool
Abstract
In some DL applications such as remote sensing, it is hard to obtain high task performance (e.g., accuracy) with a DL model for image analysis due to the low-resolution characteristics of the imagery. Accordingly, several studies have attempted to provide visual explanations or to apply the attention mechanism to enhance the reliability of the image analysis. However, structural complexity still remains in obtaining a sophisticated visual explanation with such existing methods: 1) from which layer the visual explanation should be extracted, and 2) to which layers the attention modules should be applied. 3) Subsequently, in order to observe the visual explanations across such diverse episodes of applying attention modules individually, training-cost inefficiency inevitably arises, since conventional methods require training multiple models one by one. To solve these problems, we propose a new scheme that recursively mediates visual explanations at the pixel level. Specifically, we propose DropAtt, which generates a multiple-episodes pool by training only a single network once as an amortized model and remains stable in task performance regardless of the layer-wise attention policy. From the multiple-episodes pool generated by DropAtt, our visual explanations mediation scheme quantitatively evaluates the explainability of each visual explanation and recursively expands the parts of explanations with high explainability, thereby adjusting how much each episodic layer-wise explanation is reflected so that the dominant explainability of each candidate is enforced. In the empirical evaluation, our methods demonstrate their feasibility for enhancing visual explainability, reducing the average drop by about 17% and increasing the rate of increase in confidence by about 3%.
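To make the amortized-training idea behind DropAtt concrete, the sketch below shows one way a layer-wise attention module could be stochastically dropped during training so that, after a single training run, any layer-wise attention on/off "episode" can be evaluated without retraining. This is a minimal illustration assuming PyTorch-style modules; the class and attribute names (DropAttBlock, use_attention, p_drop) are hypothetical and not taken from the authors' implementation.

```python
import torch
import torch.nn as nn

class DropAttBlock(nn.Module):
    """Illustrative block: a backbone layer whose attention module is
    randomly bypassed during training, so one trained network can later
    be evaluated under any fixed layer-wise attention policy (episode)."""

    def __init__(self, layer: nn.Module, attention: nn.Module, p_drop: float = 0.5):
        super().__init__()
        self.layer = layer            # e.g., a conv block of the backbone
        self.attention = attention    # e.g., an SE/CBAM-style attention module
        self.p_drop = p_drop          # probability of skipping attention per step
        self.use_attention = True     # fixed policy applied at evaluation time

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.layer(x)
        if self.training:
            # Sample whether this layer's attention is active for this step.
            active = bool(torch.rand(()) >= self.p_drop)
        else:
            # At inference, follow the chosen episode's layer-wise policy.
            active = self.use_attention
        return self.attention(h) if active else h
```

Under this reading, the episodes pool is enumerated after training by toggling `use_attention` per block and extracting a visual explanation from each resulting configuration.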
Keywords