Baghdad Science Journal (Feb 2024)

Prioritized Text Detergent: Comparing Two Judgment Scales of Analytic Hierarchy Process on Prioritizing Pre-Processing Techniques on Social Media Sentiment Analysis

  • Ummu Hani’ Hair Zaki,
  • Roliana Ibrahim,
  • Shahliza Abd Halim,
  • Izyan Izzati Kamsani

DOI
https://doi.org/10.21123/bsj.2024.9750
Journal volume & issue
Vol. 21, no. 2(SI)

Abstract

Most companies use social media data for business purposes. Sentiment analysis automatically gathers, analyzes, and summarizes this type of data. Managing unstructured social media data is difficult, and noisy data is a particular challenge for sentiment analysis. Because data pre-processing accounts for over 50% of the sentiment analysis process, handling large volumes of social media data is also challenging. If pre-processing is carried out correctly, data accuracy may improve. Moreover, the stages of the sentiment analysis workflow are highly interdependent. Because no pre-processing technique works well in all situations or with all data sources, choosing the most important techniques is crucial, and prioritization is a well-suited way to make that choice. Among the many Multi-Criteria Decision Making (MCDM) methods, the Analytic Hierarchy Process (AHP) is preferred for handling complicated decision-making problems that involve several criteria. Consistency Ratio (CR) scores of the pair-wise comparisons were used to evaluate the AHP. To obtain the most consistent judgments, this study used two judgment scales: first the Saaty judgment scale (SS), then the Generalized Balanced Scale (GBS). It investigated whether two different AHP judgment scales would affect decision-making. The main criteria for prioritizing pre-processing techniques in sentiment analysis are Punctuation, Spelling, Number, and Context, and each of these four criteria also contains sub-criteria. The GBS pair-wise comparisons are closer to the CR value than those of SS, which reduces the alternatives’ weight ratios. This paper explains how AHP aids logical decision-making. Prioritizing pre-processing techniques with AHP can serve as a paradigm for other stages of sentiment analysis. In short, this paper makes a further contribution to the Big Data Analytics domain.
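
As an illustration of the consistency check mentioned in the abstract, the Python sketch below derives priority weights from a pair-wise comparison matrix and computes its Consistency Ratio. It is a minimal sketch under stated assumptions, not the authors' implementation: the judgment values are invented for demonstration and do not come from the paper, the function names are hypothetical, and the weights are approximated by power iteration. The Random Index values and the CR < 0.10 acceptance threshold follow Saaty's commonly cited convention.

```python
# Illustrative AHP sketch. The judgments below are INVENTED for demonstration
# and are not taken from the paper.

CRITERIA = ["Punctuation", "Spelling", "Number", "Context"]

# Saaty's Random Index (RI) by matrix order n, as commonly tabulated.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def build_matrix(upper, n):
    """Build a reciprocal matrix from upper-triangle judgments {(i, j): value}."""
    matrix = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        matrix[i][j] = float(value)
        matrix[j][i] = 1.0 / value
    return matrix


def multiply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def priorities(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration (sums to 1)."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = multiply(matrix, weights)
        total = sum(product)
        weights = [x / total for x in product]
    return weights


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # matrices of order 1 or 2 are always consistent
    product = multiply(matrix, weights)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    lambda_max = sum(p / w for p, w in zip(product, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments on the 1-9 Saaty scale (row criterion vs. column criterion).
judgments = {
    (0, 1): 1 / 2, (0, 2): 2, (0, 3): 1 / 4,
    (1, 2): 3, (1, 3): 1 / 3,
    (2, 3): 1 / 5,
}

matrix = build_matrix(judgments, len(CRITERIA))
weights = priorities(matrix)
cr = consistency_ratio(matrix, weights)

for name, weight in zip(CRITERIA, weights):
    print(f"{name}: {weight:.3f}")
print(f"CR = {cr:.3f} ({'acceptable' if cr < 0.10 else 'revise judgments'})")
```

Comparing two judgment scales in this setting would amount to mapping the same verbal judgments to different numeric values before calling build_matrix, then comparing the resulting weights and CR scores.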

Keywords