SAGE Open (Dec 2020)

Using XGBoost and Skip-Gram Model to Predict Online Review Popularity

  • Lien Thi Kim Nguyen,
  • Hao-Hsuan Chung,
  • Kristine Velasquez Tuliao,
  • Tom M. Y. Lin

DOI
https://doi.org/10.1177/2158244020983316
Journal volume & issue
Vol. 10

Abstract

Review popularity is similar to the awareness and information accessibility components: both have a profound effect on customer purchase decisions. This study therefore proposes a new method for predicting online review popularity that combines the extreme gradient boosting tree algorithm (XGBoost), which extracts key features on the basis of ranking scores, with the skip-gram model, which subsequently identifies semantically related words for key textual terms. Findings revealed that textual review features were associated with higher review popularity than non-textual features (reviewer and product factors). Moreover, the proposed method achieved higher prediction accuracy than traditional ridge regression, as measured by Root Mean Squared Logarithmic Error (RMSLE). The main factors affecting review popularity and the key reviewers for specific textual terms were also identified. These findings could help vendors identify key influencers for product promotion and support the design of word-suggestion systems for online reviews.
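
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how RMSLE is commonly computed. It uses the standard definition (the square root of the mean squared difference between log(1 + predicted) and log(1 + actual)) and is not taken from the article itself. The function name `rmsle` and the sample values are illustrative assumptions.

```python
import math


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error under its standard definition.

    Assumes both sequences have the same non-zero length and contain
    values greater than -1 (popularity counts are typically non-negative).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) computes log(1 + x), which keeps zero counts well defined.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical popularity counts, for illustration only.
example_actual = [0, 3, 10, 25]
example_predicted = [1, 2, 12, 20]
example_score = rmsle(example_actual, example_predicted)
```

A lower RMSLE indicates closer agreement between predicted and observed values. Because the metric works on logarithms, it penalizes relative rather than absolute differences, which suits skewed count data such as review popularity.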