Complex & Intelligent Systems (Nov 2024)
Empowering Urdu sentiment analysis: an attention-based stacked CNN-Bi-LSTM DNN with multilingual BERT
Abstract
Sentiment analysis (SA) as a research field has gained popularity among researchers around the globe over the past 10 years. Deep neural networks (DNNs) and word vector models are widely employed nowadays and perform well in sentiment analysis. Among the deep neural networks used for SA, bi-directional long short-term memory (Bi-LSTM), BERT, and CNN models have received much attention. Although these models can process a wide range of text types, DNNs treat all features equally, so using them in the feature-learning phase of a DNN model produces a feature space of very high dimensionality. We propose an attention-based, stacked, two-layer CNN-Bi-LSTM DNN to overcome these shortcomings. After local feature extraction, the stacked two-layer Bi-LSTM in our proposed model captures incoming and outgoing sequences by reading the sequential data stream in both backward and forward directions. The output of the stacked two-layer Bi-LSTM is fed to an attention layer that assigns different weights to different words. In the proposed network, a second Bi-LSTM layer is stacked on top of the first to increase performance. Various experiments were conducted to evaluate the effectiveness of our proposed model on two Urdu sentiment analysis datasets, UCSA-21 and UCSA, achieving accuracies of 83.12% and 78.91%, respectively.
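The following is a minimal Keras sketch of the kind of attention-based stacked CNN-Bi-LSTM pipeline the abstract describes; it is an illustrative assumption, not the authors' implementation, and the vocabulary size, sequence length, layer widths, and kernel size are placeholder values rather than the paper's reported hyperparameters.

```python
# Illustrative sketch (assumed hyperparameters): CNN for local features,
# a stacked two-layer Bi-LSTM, and an additive attention layer that weights
# each time step before classification.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 30000   # assumed vocabulary size (e.g., from a multilingual BERT tokenizer)
MAX_LEN = 128        # assumed maximum sequence length
EMBED_DIM = 300      # assumed embedding dimension

inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)

# CNN block: extract local n-gram features from the embedded sequence.
x = layers.Conv1D(filters=128, kernel_size=3, padding="same", activation="relu")(x)
x = layers.MaxPooling1D(pool_size=2)(x)

# Stacked two-layer Bi-LSTM: read the feature sequence in forward and backward directions.
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)

# Attention: score each time step, normalise over time, and form a weighted context vector.
attn = layers.Dense(1, activation="tanh")(x)
attn = layers.Softmax(axis=1)(attn)
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([x, attn])

outputs = layers.Dense(1, activation="sigmoid")(context)  # binary sentiment output
model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

In this sketch the attention weights are produced by a single tanh-scored dense layer and a softmax over the time axis; the paper's exact attention formulation and the multilingual BERT embedding setup may differ.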
Keywords