An Improved Model for Analyzing Textual Sentiment Based on a Deep Neural Network Using Multi-Head Attention Mechanism

Hashem Saleh Sharaf Al-deen; Zhiwen Zeng; Raeed Al-sabri; Arash Hekmat

doi:10.3390/asi4040085

Applied System Innovation (Oct 2021)

Affiliations

Hashem Saleh Sharaf Al-deen: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Zhiwen Zeng: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Raeed Al-sabri: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Arash Hekmat: School of Computer Science and Engineering, Central South University, Changsha 410083, China

Journal volume & issue: Vol. 4, no. 4, p. 85

Abstract

With the rapid growth of social media content on websites such as Twitter and Facebook, analyzing textual sentiment has become a challenging task, and many studies have focused on it. Recently, deep learning models such as convolutional neural networks and long short-term memory networks have achieved promising performance in sentiment analysis and have proven able to cope with sequences of arbitrary length. However, when they are used in the feature extraction layer, the feature space is high-dimensional, the text data are sparse, and all features are assigned equal importance. To address these issues, we propose a hybrid model that combines a deep neural network with a multi-head attention mechanism (DNN–MHAT). In the DNN–MHAT model, we first design an improved deep neural network that captures the actual context of the text and extracts position-invariant local features by combining recurrent bidirectional long short-term memory units (Bi-LSTM) with a convolutional neural network (CNN). Second, we apply a multi-head attention mechanism to capture the words in the text that are most relevant to long-distance dependencies and their encoding, which places a different focus on the information output by the hidden layers of the Bi-LSTM. Finally, global average pooling transforms the vector into a high-level sentiment representation, which helps avoid overfitting, and a sigmoid classifier performs the sentiment polarity classification of texts. The DNN–MHAT model is tested on four review datasets and two Twitter datasets. The experimental results demonstrate the effectiveness of the DNN–MHAT model, which achieved excellent performance compared with state-of-the-art baseline methods on both short tweets and long reviews.
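
The last three stages described in the abstract (multi-head attention over the recurrent hidden states, global average pooling, and a sigmoid classifier) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes the Bi-LSTM/CNN stage has already produced one hidden vector per token, uses toy values for those vectors, omits the learned query/key/value projections that a real attention layer would have, and uses hypothetical classifier weights. All function and variable names are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(h):
    """Scaled dot-product self-attention for one head.

    h is a list of T vectors of size d_k. Queries, keys and values are all
    taken to be h itself (assumption: no learned projection matrices).
    """
    d_k = len(h[0])
    out = []
    for q in h:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in h]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, h)) for j in range(d_k)])
    return out


def multi_head_attention(hidden, num_heads):
    """Split each hidden vector into num_heads slices, attend per slice,
    then concatenate the head outputs for every time step."""
    d = len(hidden[0])
    assert d % num_heads == 0, "hidden size must be divisible by num_heads"
    d_k = d // num_heads
    heads = [
        attention_head([vec[i * d_k:(i + 1) * d_k] for vec in hidden])
        for i in range(num_heads)
    ]
    return [[x for head in heads for x in head[t]] for t in range(len(hidden))]


def global_average_pool(seq):
    """Average over the time axis, giving one fixed-size vector."""
    steps = len(seq)
    return [sum(vec[j] for vec in seq) / steps for j in range(len(seq[0]))]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def classify(hidden, weights, bias, num_heads=2):
    """Attention -> global average pooling -> sigmoid polarity score."""
    attended = multi_head_attention(hidden, num_heads)
    pooled = global_average_pool(attended)
    return sigmoid(sum(w * x for w, x in zip(weights, pooled)) + bias)


# Toy stand-in for Bi-LSTM hidden states: 3 tokens, hidden size 4.
hidden_states = [
    [0.1, 0.4, -0.2, 0.3],
    [0.5, -0.1, 0.2, 0.0],
    [-0.3, 0.2, 0.6, 0.1],
]
# Hypothetical classifier parameters, chosen only for illustration.
prob = classify(hidden_states, weights=[0.5, -0.25, 0.75, 0.1], bias=0.0)
print(f"P(positive) = {prob:.3f}")
```

Because pooling averages over the time axis, the classifier input has a fixed size regardless of text length, which is one reason this kind of pooling suits both short tweets and long reviews.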

Published in Applied System Innovation

ISSN: 2571-5577 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.mdpi.com/journal/asi
