Machine Learning-Based Opinion Spam Detection: A Systematic Literature Review

Atika Qazi; Najmul Hasan; Rui Mao; Mohamed Elhag Mohamed Abo; Samrat Kumar Dey; Glenn Hardaker

doi:10.1109/ACCESS.2024.3399264

IEEE Access (Jan 2024)

Affiliations

Atika Qazi: Centre for Lifelong Learning, Universiti Brunei Darussalam, Bandar Seri Begawan, Brunei
Najmul Hasan: BRAC Business School, BRAC University, Dhaka, Bangladesh
Rui Mao: College of Computing and Data Science, Nanyang Technological University, Jurong West, Singapore
Mohamed Elhag Mohamed Abo: Faculty of Computer Science, The Future University, Khartoum, Sudan
Samrat Kumar Dey: School of Science and Technology (SST), Bangladesh Open University (BOU), Gazipur, Bangladesh
Glenn Hardaker: Office of the Provost, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

DOI: https://doi.org/10.1109/ACCESS.2024.3399264
Journal volume & issue: Vol. 12
pp. 143485 – 143499

Abstract

The substantial upsurge in Web 2.0 deployment allows a large number of consumers to share their opinions with potential consumers and producers through product and service review platforms. Customer decision-making relies on an important aspect of reviews: spam-free opinions. Unfortunately, fraudulent activities produce spam reviews that mislead potential buyers and traders. Consequently, spam reviews prevent opinion-mining techniques from reaching accurate conclusions. This study aims to identify and review existing state-of-the-art methodologies for three groups: 1) spam reviews; 2) individual spammers; and 3) group spam. The machine learning (ML) and deep learning (DL) techniques for spam detection are categorized, and we establish an assessment that may be deemed appropriate in the field. The findings reveal a total of 10 metrics used with ML-based techniques in spam detection, with accuracy being the most often used (25%), followed by recall in 24% of studies and precision in 22% of studies. In addition, the F-measure, the area under the curve (AUC), and F1-score evaluation metrics reveal that the use of the Amazon dataset as a whole increased by 7%. This study concludes that the majority of SMS spam filtering strategies are leading solutions. In addition, a taxonomy of existing state-of-the-art methodologies is developed, and it is concluded that a substantial number of studies utilize these existing SMS anti-spam applications. This research uncovers previously unexplored areas in the application of ML and DL to spam reviews and provides a new paradigm for applying these technologies to the issue. The findings provide both academics and practitioners with a deeper understanding of the barriers to spam review identification, as well as potential improvement opportunities using ML techniques. Both groups now have the chance to consider a benchmark for future research.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
