Jisuanji kexue (May 2022)
Interpretability Optimization Method Based on Mutual Transfer of Local Attention Map
Abstract
At present, deep learning models are widely deployed in various industrial fields. However, the complexity and inexplicability of deep learning models have become the main bottleneck for their application in high-risk fields. The most important interpretation approach is visual interpretation, in which the attention map is the main representation: it marks the decision area in a sample image to visually display the decision basis of the model. In existing visual interpretation methods based on attention maps, the attention map of a single model is prone to annotation errors in the marked region, so the confidence in its visual interpretability is insufficient. To solve this problem, this paper proposes an interpretability optimization method based on the mutual transfer of local attention maps, aiming to improve the annotation accuracy of the model attention map and to display the precise decision area, thereby strengthening the visual interpretability of the model's decision basis. Specifically, a mutual transfer network structure is built from lightweight models, feature maps are extracted and superimposed across the layers of each single model, and the resulting global attention map is divided into local attention maps. The Pearson correlation coefficient is used to measure the similarity of corresponding local attention maps between the models, and the local attention maps are then regularized and transferred in combination with the cross-entropy function. Experimental results show that the proposed algorithm significantly improves the annotation accuracy of the model attention map. It achieves an average drop rate of 28.2% and an average increase rate of 29.5%, and improves the average drop rate by 3.3% compared with the most advanced algorithm. These experiments show that the proposed algorithm can successfully find the most responsive region in the sample image, rather than being limited to the visually visible region. Compared with existing similar methods, the proposed method reveals the decision basis of the original CNN model more accurately.
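The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the grid-based split, the `1 - r` form of the regularizer, and the weight `lam` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def attention_map(feature_maps):
    """Superimpose a (C, H, W) stack of feature maps into one (H, W) map in [0, 1]."""
    amap = np.abs(feature_maps).sum(axis=0)
    return amap / (amap.max() + 1e-8)

def split_local(amap, grid):
    """Divide an (H, W) map into grid x grid local patches, one flattened row each."""
    h, w = amap.shape
    ph, pw = h // grid, w // grid
    patches = amap[:ph * grid, :pw * grid].reshape(grid, ph, grid, pw)
    return patches.transpose(0, 2, 1, 3).reshape(grid * grid, ph * pw)

def pearson(a, b, eps=1e-8):
    """Row-wise Pearson correlation between two (N, D) arrays of patches."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return (a * b).sum(axis=1) / denom

def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with a stable log-softmax."""
    z = logits - logits.max()
    return -(z - np.log(np.exp(z).sum()))[label]

def transfer_loss(feats_a, feats_b, logits_a, label, grid=2, lam=1.0):
    """Loss for model A: cross-entropy plus a local attention similarity penalty.

    The penalty (assumed form) is the mean of 1 - r over corresponding local
    patches of the two models, so it vanishes when the patches are perfectly
    correlated.
    """
    local_a = split_local(attention_map(feats_a), grid)
    local_b = split_local(attention_map(feats_b), grid)
    reg = np.mean(1.0 - pearson(local_a, local_b))
    return cross_entropy(logits_a, label) + lam * reg
```

In a mutual setup, the same loss would be evaluated a second time with the roles of the two models swapped, so that each model's local attention maps are pulled toward the other's.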
Keywords