Symmetry (Apr 2020)

Using the Least Squares Support Vector Regression to Forecast Movie Sales with Data from Twitter and Movie Databases

  • Yi-Ting Huang,
  • Ping-Feng Pai

DOI
https://doi.org/10.3390/sym12040625
Journal volume & issue
Vol. 12, no. 4
p. 625

Abstract

Due to the rapid rise in the prominence and popularity of social media, social broadcasting networks with voluntary information sharing have become one of the most powerful ways to spread word-of-mouth opinions and thus influence consumers’ preferences toward products. Sentiment analysis data from social media have therefore become more important in forecasting product sales. For the movie industry, the opinions expressed on social media have an increasing impact on movie sales. In addition, some databases, such as Box Office Mojo and the Internet Movie Database (IMDb), contain structured data for predicting movie sales. Thus, three categories of data (movie database data, tweet data, and hybrid data combining movie databases and tweets) are employed symmetrically in this study. The aim of this study is to employ least squares support vector regression (LSSVR) to forecast worldwide movie sales with these three forms of data. Three other forecasting techniques, namely the back propagation neural network (BPNN), the generalized regression neural network (GRNN), and the multivariate linear regression (MLR) model, were also used to forecast movie sales with the three types of data. The empirical results show that the LSSVR model with hybrid data obtains more accurate results than the other forecasting models with all data types. Thus, using the LSSVR model with data from both movie databases and tweets is a feasible and promising way to forecast movie sales.
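
For readers unfamiliar with LSSVR, the following is a minimal sketch of the standard textbook formulation, in which training reduces to solving one linear system. It is not the authors' implementation: the RBF kernel, the hyperparameters `gamma` and `sigma`, and the function names are all assumptions made for illustration, and the abstract does not state which kernel or settings the study used.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Fit an LSSVR model by solving its linear system.

    Solves  [0   1^T        ] [b    ]   [0]
            [1   K + I/gamma] [alpha] = [y]
    where gamma is the regularization constant (assumed value).
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Predict f(x) = sum_i alpha_i * k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

In a setting like the one described, each row of `X` would hold one movie's features (for example, database attributes, tweet sentiment scores, or both for the hybrid case) and `y` the corresponding sales figure; those feature choices are hypothetical here.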

Keywords