Leveraging machine learning to analyze sentiment from COVID‐19 tweets: A global perspective

Md Mahbubar Rahman; Nafiz Imtiaz Khan; Iqbal H. Sarker; Mohiuddin Ahmed; Muhammad Nazrul Islam

doi:10.1002/eng2.12572

Engineering Reports (Mar 2023)

Affiliations

Md Mahbubar Rahman: Department of Computer Science and Engineering, Military Institute of Science and Technology (MIST), Dhaka, Bangladesh
Nafiz Imtiaz Khan: Department of Computer Science and Engineering, Military Institute of Science and Technology (MIST), Dhaka, Bangladesh
Iqbal H. Sarker: Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
Mohiuddin Ahmed: School of Science, Edith Cowan University, Joondalup, Western Australia, Australia
Muhammad Nazrul Islam: Department of Computer Science and Engineering, Military Institute of Science and Technology (MIST), Dhaka, Bangladesh

DOI: https://doi.org/10.1002/eng2.12572
Journal volume & issue: Vol. 5, no. 3

Abstract

Since the advent of the worldwide COVID‐19 pandemic, analyzing public sentiment has become a major concern for policy and decision‐makers. While the priority is to curb the spread of the virus, analyzing the sentiment of the mass population (users) is equally important. Although sentiment analysis with various state‐of‐the‐art technologies has received attention during the COVID‐19 pandemic, the reasons behind the variations in public sentiment are yet to be explored. Moreover, how user sentiment varies across countries during the pandemic has received less attention. Therefore, the objectives of this study are: to identify the most effective machine learning (ML) technique for classifying public sentiment, to analyze the variations of public sentiment across the globe, and to find the critical factors contributing to sentiment variations. To attain these objectives, 12,000 tweets, 4000 each from the USA, UK, and Bangladesh, were rigorously annotated by three independent reviewers. Based on the labeled tweets, four boosting ML models, namely CatBoost, gradient boost, AdaBoost, and XGBoost, were investigated. Next, the top‐performing ML model predicted the sentiment of 300,000 tweets (100,000 from each country), and public perceptions were analyzed based on the labeled data. First, the CatBoost model showed the highest F1‐score (85.8%), followed by gradient boost (84.3%), XGBoost (83.1%), and AdaBoost (78.9%). Second, it was revealed that during the COVID‐19 pandemic, the sentiments of the people of the three countries were mainly negative, followed by positive and neutral. Finally, this study identified a few critical concerns that primarily drive the variation in public sentiment around the globe: lockdown, quarantine, hospital, mask, vaccine, and the like.
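
The model comparison described in the abstract ranks classifiers by F1‐score on labeled tweets. As a rough illustration only, the sketch below computes a macro‐averaged F1 over three sentiment classes and picks the best‐scoring model. The abstract does not state which F1 averaging the authors used, so macro averaging is an assumption here, and the function names (`macro_f1`, `best_model`, `sentiment_shares`), the model names, and the toy labels are all hypothetical and not taken from the paper.

```python
from collections import Counter

# Three sentiment classes, as named in the abstract.
LABELS = ("negative", "neutral", "positive")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def best_model(predictions, y_true):
    """Return (name, score) for the model with the highest macro F1."""
    scored = {name: macro_f1(y_true, y_pred) for name, y_pred in predictions.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]


def sentiment_shares(labels):
    """Fraction of items in each sentiment class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (counts[label] / total if total else 0.0) for label in LABELS}


# Hypothetical toy data, not from the study.
gold = ["negative", "negative", "positive", "neutral"]
predictions = {
    "model_a": ["negative", "negative", "positive", "neutral"],
    "model_b": ["negative", "positive", "positive", "neutral"],
}
winner, score = best_model(predictions, gold)
print(winner, round(score, 3))
print(sentiment_shares(predictions[winner]))
```

In a pipeline like the one the abstract outlines, the selected model would then label the larger unannotated set, and a per‐country breakdown such as `sentiment_shares` would summarize the predicted sentiment.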

Published in Engineering Reports

ISSN: 2577-8196 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/25778196
