An efficient method for disaster tweets classification using gradient-based optimized convolutional neural networks with BERT embeddings

Deepak Dharrao; Aadithyanarayanan MR; Rewaa Mital; Abhinav Vengali; Madhuri Pangavhane; Satpalsing Rajput; Anupkumar M. Bongale

MethodsX (Dec 2024)


Affiliations

Deepak Dharrao: Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India
Aadithyanarayanan MR: Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India
Rewaa Mital: Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India
Abhinav Vengali: Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India
Madhuri Pangavhane: Department of Computer Science and Engineering, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India
Satpalsing Rajput: Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, India
Anupkumar M. Bongale: Department of Artificial Intelligence and Machine learning, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, India; Corresponding author.

Journal volume & issue: Vol. 13, p. 102843

Abstract

Disaster events are actively discussed on microblogging platforms such as Twitter, which can lead to chaotic situations. In the era of machine learning and deep learning, these chaotic situations can be controlled by developing efficient methods and models that help classify real and fake tweets. This research article proposes an efficient method, the BERT Embedding based CNN model with RMSProp Optimizer, to classify tweets related to disaster scenarios. Tweet classification is first carried out with popular machine learning algorithms such as logistic regression and decision tree classifiers. Given the low accuracy of these machine learning models, a Convolutional Neural Network (CNN) based deep learning model is selected as the primary classification method. The CNN's performance is improved by optimizing its parameters with gradient-based optimizers. To further raise accuracy and to capture contextual semantics from the text data, BERT embeddings are included in the proposed model. The proposed method, the BERT Embedding based CNN model with RMSProp Optimizer, achieved an F1 score of 0.80 and an accuracy of 0.83. The methodology presented in this research article makes the following key contributions:
• Identification of a suitable text classification model that can effectively capture complex patterns when dealing with large vocabularies or nuanced language structures in disaster management scenarios.
• Exploration of gradient-based optimization techniques, namely the Adam, Stochastic Gradient Descent (SGD), AdaGrad, and RMSProp optimizers, to identify the optimizer best suited to the characteristics of the dataset and the CNN model architecture.
• Presentation of the “BERT Embedding based CNN model with RMSProp Optimizer”, a method that classifies disaster tweets and captures semantic representations by leveraging BERT embeddings with appropriate feature selection; the models are validated through comparative analysis.
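
The four gradient-based optimizers named in the abstract differ mainly in how they scale each parameter update. The sketch below is a minimal illustration of their standard textbook update rules, not the authors' implementation: the two-parameter quadratic loss, the starting point, the step count, and every learning rate and hyperparameter are assumptions chosen only to make the example self-contained. It does not involve the paper's CNN, BERT embeddings, or dataset.

```python
import math

# Toy loss (an assumption for illustration): f(w) = 0.5 * (10 * w0^2 + w1^2).
def loss(w):
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)

def grad(w):
    return [10.0 * w[0], w[1]]

def sgd(w, g, state, lr=0.05):
    # Plain gradient descent: the same step size for every parameter.
    return [wi - lr * gi for wi, gi in zip(w, g)]

def adagrad(w, g, state, lr=0.5, eps=1e-8):
    # Divide by the root of the running SUM of squared gradients.
    acc = state.setdefault("acc", [0.0] * len(w))
    for i, gi in enumerate(g):
        acc[i] += gi * gi
    return [wi - lr * gi / (math.sqrt(a) + eps) for wi, gi, a in zip(w, g, acc)]

def rmsprop(w, g, state, lr=0.05, rho=0.9, eps=1e-8):
    # Divide by the root of an exponential moving AVERAGE of squared gradients.
    avg = state.setdefault("avg", [0.0] * len(w))
    for i, gi in enumerate(g):
        avg[i] = rho * avg[i] + (1.0 - rho) * gi * gi
    return [wi - lr * gi / (math.sqrt(a) + eps) for wi, gi, a in zip(w, g, avg)]

def adam(w, g, state, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    # Moving averages of the gradient and its square, with bias correction.
    m = state.setdefault("m", [0.0] * len(w))
    v = state.setdefault("v", [0.0] * len(w))
    state["t"] = state.get("t", 0) + 1
    t = state["t"]
    new_w = []
    for i, (wi, gi) in enumerate(zip(w, g)):
        m[i] = beta1 * m[i] + (1.0 - beta1) * gi
        v[i] = beta2 * v[i] + (1.0 - beta2) * gi * gi
        m_hat = m[i] / (1.0 - beta1 ** t)
        v_hat = v[i] / (1.0 - beta2 ** t)
        new_w.append(wi - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_w

def run(step, steps=200):
    # Apply one optimizer for a fixed number of steps from the same start.
    w, state = [1.0, 1.0], {}
    for _ in range(steps):
        w = step(w, grad(w), state)
    return loss(w)

optimizers = [("SGD", sgd), ("AdaGrad", adagrad), ("RMSProp", rmsprop), ("Adam", adam)]
results = {name: run(step) for name, step in optimizers}

for name, final_loss in results.items():
    print(f"{name:8s} final loss = {final_loss:.6g}")
```

How the optimizers rank on this toy problem says nothing about how they rank on the disaster tweet dataset. In practice the same comparison would be run by training the CNN once per optimizer and comparing validation F1 score and accuracy.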

Disaster Tweet classification using CNN with BERT embeddings and RMS-Prop Optimization; an Efficient Method for Disaster Tweets Classification using Gradient-Based Optimized Convolutional Neural Networks with BERT embeddings

Published in MethodsX

ISSN: 2215-0161 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Science
Website: http://www.journals.elsevier.com/methodsx/
