Information (Jul 2020)

Automated Seeded Latent Dirichlet Allocation for Social Media Based Event Detection and Mapping

  • Cornelia Ferner,
  • Clemens Havas,
  • Elisabeth Birnbacher,
  • Stefan Wegenkittl,
  • Bernd Resch

DOI
https://doi.org/10.3390/info11080376
Journal volume & issue
Vol. 11, no. 8
p. 376

Abstract

In the event of a natural disaster, geo-tagged Tweets are an immediate source of information for locating casualties and damage, and for supporting disaster management. Topic modeling can help detect disaster-related Tweets in the noisy Twitter stream in an unsupervised manner. However, the results of topic models are difficult to interpret and require manual identification of one or more “disaster topics”. Immediate disaster response would benefit from a fully automated process for interpreting the modeled topics and extracting disaster-relevant information. Initializing the topic model with a set of seed words makes it possible to identify the corresponding disaster topic directly. To enable an automated end-to-end process, we automatically generate seed words using older Tweets from the same geographic area. Results for two past events (the 2014 Napa Valley earthquake and Hurricane Harvey in 2017) show that the geospatial distribution of Tweets identified as disaster-related conforms to the officially released disaster footprints. The suggested approach is applicable when there is a single topic of interest and comparative data are available.
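
The idea of deriving seed words by comparing current Tweets with older Tweets from the same area can be illustrated with a minimal sketch. The abstract does not say how the seed words are scored, so the smoothed log-ratio of relative term frequencies used here is an assumption made for illustration, not the authors' procedure. The function names `tokenize` and `generate_seed_words`, the tokenization rule, and the `top_k` parameter are likewise hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase, keep alphabetic tokens of 3+ letters.
    return re.findall(r"[a-z]{3,}", text.lower())


def generate_seed_words(event_tweets, baseline_tweets, top_k=5):
    """Rank terms that are over-represented in event-period Tweets
    relative to older (baseline) Tweets from the same area.

    The scoring rule (add-one smoothed log-ratio) is an illustrative
    assumption; the abstract does not specify the actual method.
    """
    event_counts = Counter(t for tw in event_tweets for t in tokenize(tw))
    base_counts = Counter(t for tw in baseline_tweets for t in tokenize(tw))
    event_total = sum(event_counts.values())
    base_total = sum(base_counts.values())
    vocab_size = len(set(event_counts) | set(base_counts))

    def score(term):
        # Add-one smoothing keeps terms unseen in the baseline finite.
        p_event = (event_counts[term] + 1) / (event_total + vocab_size)
        p_base = (base_counts[term] + 1) / (base_total + vocab_size)
        return math.log(p_event / p_base)

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(event_counts, key=lambda t: (-score(t), t))
    return ranked[:top_k]
```

In a pipeline of this kind, the returned terms would be passed as the seed set for one topic of a seeded topic model, so that the topic anchored by them can be read off as the disaster topic without manual inspection.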

Keywords