Complexity (Jan 2021)

Fast Detection of Deceptive Reviews by Combining the Time Series and Machine Learning

  • Minjuan Zhong,
  • Zhenjin Li,
  • Shengzong Liu,
  • Bo Yang,
  • Rui Tan,
  • Xilong Qu

DOI
https://doi.org/10.1155/2021/9923374
Journal volume & issue
Vol. 2021

Abstract

With the rapid growth of online product reviews, many users consult others’ opinions before deciding to purchase a product. Unfortunately, this has encouraged the widespread use of fake reviews, which lead to many poor purchase decisions. Effectively identifying deceptive reviews has therefore become a crucial yet challenging task. Existing supervised learning methods require a large number of examples of deceptive and truthful opinions labeled by domain experts, while the available unsupervised learning methods are inefficient because they depend on reviewer features to detect each fake review. To address both the detection efficiency problem and the dependence on large amounts of labeled examples, in this paper we propose a semisupervised learning approach for detecting spam reviews. First, a time series model of all the reviews of a product is constructed, and suspicious time intervals are captured based on bursts in the number of reviews within those intervals. Second, a two-view co-training semisupervised learning algorithm is run in each captured interval, in which linguistic cues, metadata, and user purchase behaviors are jointly used to classify each review as spam or not. A series of numerical experiments on a real dataset acquired from Taobao.com confirm the effectiveness of the proposed model: it is time-efficient and accurate, and it overcomes the shortcoming of supervised learning methods, which depend on large amounts of labeled examples. The approach thus achieves a trade-off between accuracy and efficiency.
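
The first stage described in the abstract, capturing time intervals in which the number of reviews bursts, can be illustrated with a minimal sketch. The abstract does not specify how bursts are detected, so everything below is an assumption made for illustration: the function names `daily_counts` and `suspicious_intervals`, the daily granularity, and the threshold rule (mean plus `k` standard deviations of the daily counts) are hypothetical and are not the authors' method.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev


def daily_counts(review_dates):
    """Build a dense daily series of review counts (days with no reviews count 0)."""
    counts = Counter(review_dates)
    start, end = min(counts), max(counts)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return days, [counts.get(d, 0) for d in days]


def suspicious_intervals(review_dates, k=1.5):
    """Return (first_day, last_day) runs whose daily count exceeds mean + k * std.

    The threshold rule is an illustrative assumption, not taken from the paper.
    """
    days, series = daily_counts(review_dates)
    threshold = mean(series) + k * pstdev(series)

    intervals = []
    run_start = run_end = None
    for day, n in zip(days, series):
        if n > threshold:
            if run_start is None:
                run_start = day      # a new burst begins
            run_end = day            # extend the current burst
        elif run_start is not None:
            intervals.append((run_start, run_end))
            run_start = None         # burst ended
    if run_start is not None:        # burst runs to the end of the series
        intervals.append((run_start, run_end))
    return intervals


if __name__ == "__main__":
    # Synthetic data: one review per day, plus a two-day burst of extra reviews.
    dates = [date(2021, 1, 1) + timedelta(days=i) for i in range(10)]
    dates += [date(2021, 1, 5)] * 20 + [date(2021, 1, 6)] * 20
    print(suspicious_intervals(dates))
```

Under this sketch, only the reviews that fall inside the returned intervals would be passed on to the second-stage classifier, which is where the time saving claimed in the abstract would come from.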