Future Internet (Dec 2020)

An Analysis Method for Interpretability of CNN Text Classification Model

  • Peng Ce,
  • Bao Tie

DOI
https://doi.org/10.3390/fi12120228
Journal volume & issue
Vol. 12, no. 12
p. 228

Abstract

With the continuous development of artificial intelligence, text classification has gradually shifted from knowledge-based methods to methods based on statistics and machine learning. Among these, classification with a convolutional neural network (CNN) model is an important and efficient approach. Text is a kind of sequence data, but its temporal ordering is usually relatively weak, so text classification tends to depend little on the sequential structure of the full text. As a result, CNN-based text classification has gradually become a research hotspot. For machine learning, and especially deep learning, model interpretability has increasingly become a focus of academic research and industrial applications, and a key issue for the further development and application of deep learning technology. We therefore recommend using backtracking analysis to study deep learning models in depth. This paper proposes an analysis method for the interpretability of a CNN text classification model. By backtracking from the model's prediction results, the proposed method can analyze the classification results of multi-class and multi-label text classification tasks from multiple angles. Finally, the analysis results can be displayed with visualization techniques along multiple dimensions of interpretability. We verify the method with examples on IMDB (Internet Movie Database), a representative text classification data set, and the results show that the model can be analyzed effectively with our method.

Keywords