Frontiers in Public Health (Dec 2022)
Nowcasting unemployment rate during the COVID-19 pandemic using Twitter data: The case of South Africa
Abstract
The COVID-19 pandemic has hit the global economy hard, and many countries are experiencing a severe recession. A significant number of firms and businesses have gone bankrupt or been scaled down, and many individuals have lost their jobs. The main goal of this study is to give policy- and decision-makers additional, real-time information about labor market flows using Twitter data. We leverage these data to trace and nowcast the unemployment rate of South Africa during the COVID-19 pandemic. First, we create a dataset of unemployment-related tweets using selected keywords. Principal Component Regression (PCR) is then applied to nowcast the unemployment rate from the gathered tweets and their sentiment scores. Numerical results indicate that tweet volume correlates positively, and tweet sentiment negatively, with the unemployment rate both before and during the COVID-19 pandemic. Moreover, the PCR nowcast of the unemployment rate evaluates well, with a low Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Symmetric MAPE (SMAPE) of 0.921, 0.018, and 0.018, respectively, and a high R² score of 0.929.
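To make the pipeline summarized above concrete, the following is a minimal sketch of PCR together with the four reported evaluation metrics. It is not the authors' implementation. The data are synthetic placeholders generated in the code, not the study's tweets or unemployment figures. The feature names (`volume`, `retweets`, `sentiment`), the number of retained components, the chronological train/test split, and the particular SMAPE variant are all assumptions made for illustration.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: PCA on standardized X, then OLS on the scores."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the standardized features.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    A = np.column_stack([np.ones(len(y)), scores])   # intercept + component scores
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mean": mean, "std": std, "V": V, "coef": coef}

def pcr_predict(model, X):
    """Project new rows onto the stored components and apply the fitted coefficients."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["V"]
    return model["coef"][0] + scores @ model["coef"][1:]

def evaluate(y_true, y_pred):
    """RMSE, MAPE and SMAPE (both as fractions, an assumed convention), and R^2."""
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mape = float(np.mean(np.abs(err / y_true)))
    smape = float(np.mean(2 * np.abs(err) / (np.abs(y_true) + np.abs(y_pred))))
    r2 = float(1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return {"rmse": rmse, "mape": mape, "smape": smape, "r2": r2}

# Synthetic stand-in data (hypothetical): 40 periods of an unemployment-like series,
# with tweet volume rising and mean sentiment falling as the rate rises.
rng = np.random.default_rng(0)
n = 40
y = 25 + np.cumsum(rng.normal(0.2, 0.5, n))
volume = 1000 + 80 * y + rng.normal(0, 40, n)
retweets = 0.3 * volume + rng.normal(0, 15, n)   # deliberately collinear with volume
sentiment = 0.5 - 0.02 * y + rng.normal(0, 0.02, n)
X = np.column_stack([volume, retweets, sentiment])

# Chronological split: fit on the first 30 periods, nowcast the last 10.
split = 30
model = pcr_fit(X[:split], y[:split], n_components=2)
y_hat = pcr_predict(model, X[split:])
metrics = evaluate(y[split:], y_hat)
print(metrics)
```

Regressing on a few principal components rather than the raw features is a common way to handle strongly correlated predictors, such as several tweet-count series that move together. The values printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract.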
Keywords