EPJ Data Science (Oct 2017)

Gaining historical and international relations insights from social media: spatio-temporal real-world news analysis using Twitter

  • Vanessa Peña-Araya,
  • Mauricio Quezada,
  • Barbara Poblete,
  • Denis Parra

DOI
https://doi.org/10.1140/epjds/s13688-017-0122-8
Journal volume & issue
Vol. 6, no. 1
pp. 1 – 35

Abstract

The immense growth of the social Web, which has made a large amount of user data easily and publicly available, has opened a whole new spectrum for research in social behavioral sciences. However, as the volume of social media content increases at a very fast rate, it becomes extremely difficult to systematically obtain high-level information from this data. As a consequence, tasks related to the analysis of historical news events based on social media data have not been explored, which limits any type of comparative historical research, causality analysis, and discovery of knowledge from patterns extracted from aggregated social media event information. In this work, we target this issue by proposing a compact high-level representation of news events using social media information. This representation explicitly includes temporal information about the event and information about locations, in particular of geopolitical entities. We call this a spatio-temporal context-aware event representation. Our hypothesis is that by including social, temporal, and spatial information in the event representation, we are enabling the analysis of historical world news from a social and geopolitical perspective. This facilitates new information retrieval tasks related to historical event information extraction and international relations analysis. We support our claims by presenting two applications of this idea. The first is a visual tool, named Galean, for retrieval and exploration of historical news events within their geopolitical and temporal context. The second is a quantitative analysis of a 2-year Twitter dataset of news events reported by U.S. and U.K. media, which we explore using data mining techniques on our event representations. We present two case studies of event exploration using Galean and a user evaluation of this tool, as well as details of our empirical data mining results.

Keywords