Tehnički Vjesnik (Jan 2024)
Sentiment Analysis on Big Data: A Hybrid SED-TABU Feature Selection Method
Abstract
Big data mining is a crucial component of contemporary decision support systems that draw on social networks and other data sources. Sentiment Analysis (SA) applies text analytics to mine opinions from such sources. This research aims to develop a feature selection method for sentiment analysis that is efficient and robust to the noise and high dimensionality of Big Data environments. The objective is to select a compact set of informative features that increases the precision of sentiment classification. We propose a novel hybrid feature selection method that combines Stream Evolution Dynamics (SED) and Tabu Search (TS): SED provides exploration of the search space, and TS provides exploitation of promising regions. Each feature subset chosen by SED-TS is assessed by classifier performance, and instances are classified using the AdaBoost classifier. The proposed method was evaluated on Amazon product review data, where it outperformed wrapper-based and filter-based feature selection methods. While extracting only a small feature subset, the SED-TS hybrid attained a best accuracy of 93% and an F1 score of 0.95. The work thus combines SED and TS into a feature selection method specifically suited to sentiment analysis on Big Data. By exploiting the complementary strengths of the two techniques, the hybrid strategy offers higher accuracy and better generalization. These results show how metaheuristic approaches can be used to classify sentiment in high-dimensional, noisy data.
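To make the exploitation step concrete, the following is a minimal sketch of Tabu Search over feature subsets, not the authors' implementation. The abstract does not specify how SED works, so a random starting subset stands in for the exploratory phase here. The function name `tabu_feature_search`, its parameters, the single-feature flip move, and the toy scoring function are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, FrozenSet, Tuple


def tabu_feature_search(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    start: FrozenSet[int],
    n_iters: int = 50,
    tabu_tenure: int = 3,
) -> Tuple[FrozenSet[int], float]:
    """Tabu search over feature subsets using single-feature flip moves."""
    current = start
    best, best_score = current, score(current)
    # Maps a feature index to the iteration until which flipping it is tabu.
    tabu_until: Dict[int, int] = {}
    for it in range(n_iters):
        candidates = []
        for f in range(n_features):
            neighbour = current ^ {f}  # add f if absent, drop it if present
            s = score(neighbour)
            is_tabu = tabu_until.get(f, -1) > it
            # Aspiration rule: a tabu move is allowed if it beats the best so far.
            if not is_tabu or s > best_score:
                candidates.append((s, f, neighbour))
        if not candidates:
            continue
        # Take the best admissible neighbour even if it is worse than the
        # current subset; this is what lets the search leave local optima.
        s, f, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu_until[f] = it + tabu_tenure
        if s > best_score:
            best, best_score = current, s
    return best, best_score


# Toy objective (assumption): features 0, 3 and 7 are informative, and every
# other selected feature costs a small penalty, mimicking a wrapper criterion
# that rewards accuracy while favouring compact subsets.
INFORMATIVE = frozenset({0, 3, 7})


def toy_score(subset: FrozenSet[int]) -> float:
    return len(subset & INFORMATIVE) - 0.25 * len(subset - INFORMATIVE)


rng = random.Random(0)
# Random start: a placeholder for the exploration that SED would provide.
start = frozenset(f for f in range(10) if rng.random() < 0.5)
best, best_score = tabu_feature_search(10, toy_score, start)
print(sorted(best), best_score)
```

In a wrapper setting like the one the abstract describes, `toy_score` would be replaced by a function that trains a classifier on the selected columns and returns a validation metric, for example cross-validated accuracy of scikit-learn's `AdaBoostClassifier`. That substitution is an assumption about how the evaluation could be wired up, not a detail given in the abstract.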
Keywords