Data Science and Engineering (Sep 2019)

Mapping Entity Sets in News Archives Across Time

  • Yijun Duan,
  • Adam Jatowt,
  • Sourav S. Bhowmick,
  • Masatoshi Yoshikawa

DOI
https://doi.org/10.1007/s41019-019-00102-3
Journal volume & issue
Vol. 4, no. 3
pp. 208 – 222

Abstract

We propose a novel way of using and accessing information stored in news archives, as well as a new way of investigating history. Our idea is to automatically generate pairs of similar entities given two sets of entities, one from the past and one representing the present. This enables entity-oriented mapping between different time periods. We introduce an effective method for this task based on a concise integer linear programming framework. In particular, our model first conducts typicality analysis to estimate how representative each entity is. It then constructs an orthogonal transformation between the two entity collections. The result is a set of typical across-time comparables. We demonstrate the effectiveness of our approach on the New York Times dataset through both qualitative and quantitative tests.
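
The orthogonal transformation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes entities are already represented as dense vectors, uses the standard orthogonal Procrustes solution to align a "past" space with a "present" space, and omits the typicality analysis and the integer linear programming step entirely. The function names `fit_orthogonal_map` and `nearest_present`, and the toy data, are hypothetical.

```python
import numpy as np


def fit_orthogonal_map(past_anchors, present_anchors):
    """Orthogonal Procrustes: find W with W.T @ W = I that minimizes
    the Frobenius norm of (past_anchors @ W - present_anchors)."""
    # The closed-form solution comes from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(past_anchors.T @ present_anchors)
    return u @ vt


def nearest_present(past_vecs, present_vecs, w):
    """For each past vector, return the index of the most cosine-similar
    present vector after mapping the past vector with W."""
    mapped = past_vecs @ w
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    present = present_vecs / np.linalg.norm(present_vecs, axis=1, keepdims=True)
    return np.argmax(mapped @ present.T, axis=1)


# Toy data (assumption): the "present" space is an exact rotation of the "past" space.
rng = np.random.default_rng(0)
past = rng.normal(size=(6, 4))
true_rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
present = past @ true_rotation

w = fit_orthogonal_map(past, present)
matches = nearest_present(past, present, w)
```

In this toy setting the recovered map aligns the two spaces exactly, so each past entity is matched to its own rotated counterpart. With real news-archive embeddings the alignment would only be approximate, and the choice of anchor pairs would matter.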

Keywords