Sentiment Analysis on Big Data: A Hybrid SED-TABU Feature Selection Method

Sabitha Rajagopal; Sreemathy Jayaprakash; Karthik Subburathinam

doi:10.17559/TV-20231004000989

Tehnički Vjesnik (Jan 2024)

Affiliations

Sabitha Rajagopal: Department of Computer Science and Engineering, SNS College of Technology, Tamilnadu 641035, India
Sreemathy Jayaprakash: Department of Computer Science and Engineering, Sri Eshwar College of Engineering, Kinathakadavu, Coimbatore, Tamilnadu 641202, India
Karthik Subburathinam: Department of Computer Science and Engineering, SNS College of Technology, Coimbatore, Tamilnadu 641035, India

Journal volume & issue: Vol. 31, no. 6
pp. 2079 – 2086

Abstract

Big data mining is a crucial component of contemporary decision support systems that draw on social networks and other data sources. Sentiment Analysis (SA) applies text analytics to mine such sources for opinions. This research seeks to create a feature selection method for sentiment analysis that is efficient and robust against noise and high dimensionality in Big data environments. The objective is to select a compact set of informative features that increases sentiment classification accuracy. A novel hybrid feature selection method is proposed that combines Tabu Search (TS) and Stream Evolution Dynamics (SED): SED contributes exploration of the search space, while TS contributes exploitation of promising regions. Each feature subset chosen by SED-TS is scored by the performance of the classifier, and instances are classified using the AdaBoost classifier. The proposed method was evaluated on Amazon product review data, where it outperformed wrapper-based and filter-based feature selection methods. With a small extracted feature subset, the SED-TS hybrid attained its best accuracy of 93% and an F1 score of 0.95. The work thus combines SED and TS for feature selection specifically suited to sentiment analysis on Big data. By exploiting the complementary characteristics of the two strategies, the hybrid offers higher accuracy and better generalization. This shows how metaheuristic approaches can be used to classify sentiment in high-dimensional, noisy data.
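To make the wrapper idea in the abstract concrete, the following is a minimal sketch of the Tabu Search half only, searching over feature subsets with a pluggable scoring function. It is not the authors' implementation: the abstract does not describe how SED works, so that step is replaced here by a plain single-feature flip neighbourhood. The names `tabu_feature_search` and `toy_score`, and the parameters `tabu_tenure` and `n_neighbors`, are assumptions introduced for illustration. In the setting the abstract describes, `score` would presumably be a validation metric of an AdaBoost classifier trained on the selected features.

```python
import random
from collections import deque
from typing import Callable, FrozenSet, Tuple


def tabu_feature_search(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    n_iter: int = 50,
    tabu_tenure: int = 5,
    n_neighbors: int = 10,
    seed: int = 0,
) -> Tuple[FrozenSet[int], float]:
    """Search for a high-scoring feature subset (illustrative sketch)."""
    rng = random.Random(seed)
    # Start from a random subset of feature indices.
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    best, best_score = current, score(current)
    # Short-term memory: indices of the most recently flipped features.
    tabu = deque(maxlen=tabu_tenure)

    for _ in range(n_iter):
        candidates = []
        k = min(n_neighbors, n_features)
        for f in rng.sample(range(n_features), k):
            neighbor = current ^ {f}  # flip feature f in or out
            s = score(neighbor)
            # Skip tabu moves unless they beat the best score so far
            # (the usual aspiration criterion).
            if f in tabu and s <= best_score:
                continue
            candidates.append((s, f, neighbor))
        if not candidates:
            continue
        # Take the best admissible move, even if it is worse than the
        # current subset; this is what lets the search leave local optima.
        s, f, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu.append(f)
        if s > best_score:
            best, best_score = current, s
    return best, best_score


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical stand-in for classifier accuracy: rewards three
    'useful' features and charges a small cost per selected feature."""
    useful = {0, 3, 7}
    return len(subset & useful) - 0.1 * len(subset)


if __name__ == "__main__":
    subset, value = tabu_feature_search(12, toy_score, n_neighbors=12)
    print(sorted(subset), round(value, 2))
```

The per-feature cost in `toy_score` mirrors the stated goal of a compact subset; with a real classifier, the same effect could be obtained by penalising subset size in the score or by breaking ties in favour of smaller subsets.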

Published in Tehnički Vjesnik

ISSN: 1330-3651 (Print); 1848-6339 (Online)
Publisher: Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek
Country of publisher: Croatia
LCC subjects: Technology: Engineering (General). Civil engineering (General)
Website: http://hrcak.srce.hr/tehnicki-vjesnik
