IEEE Access (Jan 2024)

Machine Learning-Based Opinion Spam Detection: A Systematic Literature Review

  • Atika Qazi,
  • Najmul Hasan,
  • Rui Mao,
  • Mohamed Elhag Mohamed Abo,
  • Samrat Kumar Dey,
  • Glenn Hardaker

DOI
https://doi.org/10.1109/ACCESS.2024.3399264
Journal volume & issue
Vol. 12
pp. 143485 – 143499

Abstract

The substantial upsurge in Web 2.0 deployment allows a large number of consumers to share their opinions with potential consumers and producers through product and service review platforms. Customer decision-making relies on an important property of reviews: that the opinions are spam-free. Unfortunately, fraudulent activities produce spam reviews that mislead potential buyers and traders. This in turn prevents opinion-mining techniques from reaching accurate conclusions. This study aims to identify and review existing state-of-the-art methodologies for three groups: 1) spam reviews; 2) individual spammers; and 3) group spam. The machine learning (ML) and deep learning (DL) techniques for spam detection are categorized, and we establish an assessment that may be deemed appropriate in the field. The findings reveal a total of 10 metrics, with accuracy being the most often used (25%) among ML-based spam detection techniques, followed by recall in 24% of studies and precision in 22% of studies. In addition, the F-measure, the area under the curve (AUC), and F1-score evaluation metrics reveal that the use of the Amazon dataset as a whole increased by 7%. This study concludes that the majority of SMS spam filtering strategies are leading solutions. In addition, a taxonomy of existing state-of-the-art methodologies is developed, and it is concluded that a substantial number of studies utilize these existing SMS anti-spam applications. This research uncovers previously unexplored areas of ML and DL’s application to spam reviews and provides a new paradigm for applying these technologies to the issue. The findings give both academics and practitioners a deeper understanding of the barriers to spam review identification, as well as of potential improvement opportunities using ML techniques. Both groups can now consider a benchmark for future research.

Keywords