Enhancing the Generalization for Text Classification through Fusion of Backward Features

Dewen Seng; Xin Wu

doi:10.3390/s23031287

Sensors (Jan 2023)

Affiliations

Dewen Seng: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310005, China
Xin Wu: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310005, China

DOI: https://doi.org/10.3390/s23031287
Journal volume & issue: Vol. 23, no. 3, p. 1287

Abstract

Generalization has always been a central concern in deep learning. Pretrained models and domain adaptation techniques have received widespread attention as ways to address it. Both focus on finding features in the data that improve generalization and prevent overfitting. Although they have achieved good results in various tasks, these models are unstable when classifying a sentence whose label is positive but which still contains negative phrases. In this article, we analyzed the attention heat maps of the benchmark models and found that previous models pay more attention to individual phrases than to the semantic information of the whole sentence. We therefore proposed a method that scatters attention away from opposite-sentiment words to avoid a one-sided judgment. We designed a two-stream network and stacked a gradient reversal layer and a feature projection layer within the auxiliary network. The gradient reversal layer reverses the gradient of the features during training, so that in the backpropagation stage the parameters are optimized along the reversed gradient. We used the auxiliary network to extract the backward features and then fed them into the main network, where they are merged with the normal features extracted by the main network. We applied this method to three baselines, TextCNN, BERT, and RoBERTa, using sentiment analysis and sarcasm detection datasets. The results show that our method improves results on the sentiment analysis datasets by 0.5% and on the sarcasm detection datasets by 2.1%.
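
The gradient reversal behavior described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class name `GradientReversal`, the scaling factor `lam`, the one-parameter "feature extractor" `a * x`, and the squared-error loss are all assumptions chosen only to show that the layer is an identity in the forward pass and flips the sign of the gradient in the backward pass.

```python
class GradientReversal:
    """Hypothetical layer: identity forward, sign-flipped gradient backward."""

    def __init__(self, lam=1.0):
        # lam is an assumed scaling factor for the reversed gradient.
        self.lam = lam

    def forward(self, features):
        # Features pass through unchanged.
        return list(features)

    def backward(self, grad_output):
        # Gradients flowing back to the feature extractor are negated.
        return [-self.lam * g for g in grad_output]


def loss_fn(h, y):
    # Toy squared-error loss on a single scalar feature h.
    return (h - y) ** 2


def train_step(a, x, y, lr, grl=None):
    """One gradient step on the toy extractor h = a * x.

    If grl is given, the gradient w.r.t. h passes through it first.
    """
    h = a * x
    if grl is not None:
        h = grl.forward([h])[0]
    grad_h = 2.0 * (h - y)               # dL/dh
    if grl is not None:
        grad_h = grl.backward([grad_h])[0]
    grad_a = grad_h * x                  # chain rule: dh/da = x
    return a - lr * grad_a


a0, x, y, lr = 1.0, 1.0, 2.0, 0.1
loss_before = loss_fn(a0 * x, y)

a_normal = train_step(a0, x, y, lr)
a_reversed = train_step(a0, x, y, lr, grl=GradientReversal(lam=1.0))

loss_normal = loss_fn(a_normal * x, y)
loss_reversed = loss_fn(a_reversed * x, y)
```

In this toy setting the ordinary step lowers the loss while the step taken through the reversal layer raises it, which is the sense in which the extractor's parameters follow the reversed gradient. How the resulting backward features are projected and merged with the main network's features is not specified in the abstract and is not modeled here.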

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
