Applied Sciences (Sep 2024)
Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)
Abstract
This paper proposes a new, robust model for improving the accuracy of text classification on the Stanford Sentiment Treebank v2 (SST-2) dataset. We developed a BERT-based Recurrent Neural Network (RNN_Bert_based) model designed to improve classification accuracy on SST-2, which consists of movie review sentences, each labeled with either positive or negative sentiment, making this a binary classification task. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. BERT excels at text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora, which allows it to capture nuanced meanings and relationships within text. Combining BERT with an RNN therefore leverages the strengths of both architectures: BERT's bidirectional context and rich embeddings provide a deep understanding of the text, while the RNN captures sequential patterns and long-range dependencies, leading to improved performance on complex classification tasks. We also developed an integration of BERT with a K-Nearest Neighbor classifier (KNN_Bert_based) as a comparative scheme for our proposed work. Experimental results show that our proposed model outperforms traditional text classification models as well as existing models in terms of accuracy.
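To make the combined architecture concrete, the following is a minimal sketch, assuming PyTorch and the Hugging Face transformers library; the checkpoint name, the choice of a bidirectional LSTM, and the hidden sizes are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of the architecture the abstract describes: BERT contextual
# embeddings feed a recurrent layer for binary sentiment classification.
# Hyperparameters and the LSTM choice are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertRNNClassifier(nn.Module):
    def __init__(self, bert_name="bert-base-uncased", rnn_hidden=256, num_classes=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        # Bidirectional LSTM over BERT's per-token contextual embeddings.
        self.rnn = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=rnn_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * rnn_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings: (batch, seq_len, hidden).
        embeddings = self.bert(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # The RNN models sequential patterns over those embeddings.
        rnn_out, _ = self.rnn(embeddings)
        # Classify from the final time step's representation.
        return self.classifier(rnn_out[:, -1, :])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertRNNClassifier()
batch = tokenizer(["a gripping, beautifully shot film"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # shape: (1, 2)
```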
Keywords