An Analysis Method for Interpretability of CNN Text Classification Model

Peng Ce; Bao Tie

doi:10.3390/fi12120228

Future Internet (Dec 2020)

Affiliations

Peng Ce: School of Computer Science and Technology, Jilin University, Changchun 130012, China
Bao Tie: School of Computer Science and Technology, Jilin University, Changchun 130012, China

DOI: https://doi.org/10.3390/fi12120228
Journal volume & issue: Vol. 12, no. 12
p. 228

Abstract

With the continuous development of artificial intelligence, text classification has gradually shifted from knowledge-based methods to methods based on statistics and machine learning. Among these, classification based on the convolutional neural network (CNN) model is an important and efficient approach. Text is a kind of sequence data, but its temporal ordering is generally weak, so text classification usually depends little on the sequential structure of the full text. As a result, CNN-based text classification has become a research hotspot. For machine learning, and deep learning in particular, model interpretability has increasingly become a focus of academic research and industrial application, and a key issue for the further development and application of deep learning technology. We therefore recommend using backtracking analysis to study deep learning models in depth. This paper proposes an analysis method for the interpretability of a CNN text classification model. By backtracking from the model's prediction results, the proposed method can analyze the classification results of multi-class and multi-label text classification tasks from multiple angles. The analysis results can then be displayed with visualization techniques along multiple dimensions of interpretability. The method is verified with examples on IMDB (Internet Movie Database), a representative text classification data set, and the results show that the model can be analyzed effectively with our method.
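
The abstract does not spell out the backtracking procedure, so the following is only a minimal sketch of one common way to trace a CNN text classifier's prediction back to its input, not the authors' implementation. It assumes a single convolution layer with ReLU, max-over-time pooling, and a linear output layer without bias; the function name, the toy tokens, and the random weights are all hypothetical and stand in for a trained model.

```python
import numpy as np


def backtrack_token_scores(embeddings, filters, bias, class_weights, target_class):
    """Trace one class logit back to token positions (illustrative sketch).

    embeddings:    (n_tokens, dim) word vectors of the input text
    filters:       (n_filters, width, dim) convolution kernels
    bias:          (n_filters,) convolution biases
    class_weights: (n_classes, n_filters) linear output layer, no bias (assumption)
    """
    n_tokens = embeddings.shape[0]
    n_filters, width, _ = filters.shape
    n_windows = n_tokens - width + 1

    # Forward pass: one activation per (filter, window position), then ReLU.
    windows = np.stack([embeddings[i:i + width] for i in range(n_windows)])
    activations = np.einsum("pwd,fwd->fp", windows, filters) + bias[:, None]
    activations = np.maximum(activations, 0.0)

    # Max-over-time pooling: remember which window won for each filter.
    best_pos = activations.argmax(axis=1)
    pooled = activations.max(axis=1)

    # Backtracking step 1: contribution of each filter to the target logit.
    contrib = pooled * class_weights[target_class]

    # Backtracking step 2: spread each contribution evenly over the tokens
    # of the n-gram that produced the pooled value.
    token_scores = np.zeros(n_tokens)
    for f in range(n_filters):
        start = best_pos[f]
        token_scores[start:start + width] += contrib[f] / width
    return token_scores, contrib, best_pos


# Toy input with random weights (hypothetical, stands in for a trained model).
tokens = ["this", "movie", "was", "surprisingly", "good"]
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(tokens), 4))
filters = rng.normal(size=(3, 2, 4))      # 3 filters over bigrams
bias = np.zeros(3)
class_weights = rng.normal(size=(2, 3))   # 2 classes

scores, contrib, best_pos = backtrack_token_scores(
    embeddings, filters, bias, class_weights, target_class=1
)
for token, score in zip(tokens, scores):
    print(f"{token:>12s}  {score:+.3f}")
```

Under these assumptions the per-token scores sum to the logit of the chosen class, so they can be rendered directly as a heat map over the text. For a multi-label task, the same function could be called once per label of interest.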

Published in Future Internet

ISSN: 1999-5903 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/futureinternet/
