IEEE Access (Jan 2022)
Fake Online Reviews: A Unified Detection Model Using Deception Theories
Abstract
Online reviews influence consumers’ purchasing decisions. However, identifying fake online reviews automatically remains a complex problem, and current detection approaches are inefficient in preventing the spread of fake reviews. The literature on fake review detection lacks a comprehensive, interpretable, theory-based model with high performance, one that makes it possible to understand the phenomenon from a psychological perspective and to analyze reviews based on user-generated content as well as consumer behavior. In this research, we synthesized ten well-founded deception theories from psychology, namely leakage theory, four-factor theory, interpersonal deception theory, self-presentational theory, reality monitoring theory, criteria-based content analysis, scientific content analysis, verifiability approach, truth-default theory, and information manipulation theory, and selected nine relevant constructs to develop a unified model for detecting fake online reviews. These constructs are specificity, quantity, non-immediacy, affect, uncertainty, informality, consistency, source credibility, and deviation in behavior. To validate the proposed model empirically, we characterized the selected constructs using verbal and non-verbal features. We then extracted these features from the Yelp datasets and used them to train four machine learning algorithms: Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest. We demonstrated that quantity, non-immediacy, affect, informality, consistency, source credibility, and deviation in behavior are essential constructs for detecting fake reviews. To our surprise, we discovered that non-verbal features are more important than verbal features and that combining both feature types improves prediction performance. Our theory-based model outperformed most of the state-of-the-art fake review detection models while offering high interpretability and low complexity.
Keywords