Cybernetics and Information Technologies (Dec 2015)

Mining Similar Traces of Entities on Web

  • Huang Xinyan,
  • Wang Xinjun,
  • Li Hui

DOI
https://doi.org/10.1515/cait-2015-0081
Journal volume & issue
Vol. 15, no. 6
pp. 219 – 229

Abstract

Events about entities have been widely collected on the Web, allowing us to analyze how peer entities interact and to learn the relationships that exist among them. In this paper we investigate similar traces, which have not been adequately studied so far. Intuitively, peer entities tend to have similar traces. Mining similar traces poses two challenges: (1) the time lags between traces are usually unknown and variable; (2) the events of entities are large in scale, and a model representing all of them is complex. We propose a simple but practical method that addresses both challenges. First, sliding windows are adopted to extract the significant events and then find the candidate topic sequences. Second, dynamic programming is employed to mine similar candidate topic sequences of entities. Finally, an efficient method is proposed to mine all the similar traces of entities. The method mines similar traces of peer entities with high accuracy. We conduct comprehensive experiments on synthetic datasets to demonstrate the efficiency of the proposed method.
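
The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the significance test or the dynamic program, so this example assumes that a topic is "significant" when it occurs at least `min_count` times inside a window, and it substitutes a longest-common-subsequence (LCS) recurrence for the dynamic programming step. All function names, parameters, and the event format `(timestamp, topic)` are hypothetical.

```python
from collections import Counter


def candidate_topic_sequence(events, window, step, min_count):
    """Slide a time window over (timestamp, topic) events and keep, per window,
    the most frequent topic if it occurs at least min_count times (assumed
    significance rule). Consecutive duplicate topics are collapsed."""
    if not events:
        return []
    events = sorted(events)
    t, end = events[0][0], events[-1][0]
    sequence = []
    while t <= end:
        counts = Counter(topic for ts, topic in events if t <= ts < t + window)
        if counts:
            topic, n = counts.most_common(1)[0]
            if n >= min_count and (not sequence or sequence[-1] != topic):
                sequence.append(topic)
        t += step
    return sequence


def lcs_length(a, b):
    """Length of the longest common subsequence of two topic sequences,
    computed with the standard row-by-row dynamic program."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def trace_similarity(a, b):
    """Normalized LCS score in [0, 1]; an assumed similarity measure."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Two hypothetical entities whose events share an order but not exact times.
entity_a = [(0, "launch"), (1, "launch"), (2, "noise"),
            (10, "recall"), (11, "recall"), (20, "merger"), (21, "merger")]
entity_b = [(3, "launch"), (4, "launch"), (13, "recall"), (14, "recall"),
            (15, "noise"), (30, "lawsuit"), (31, "lawsuit")]

seq_a = candidate_topic_sequence(entity_a, window=5, step=5, min_count=2)
seq_b = candidate_topic_sequence(entity_b, window=5, step=5, min_count=2)
score = trace_similarity(seq_a, seq_b)
```

Because the comparison uses only the order of topics and not their timestamps, the score is unaffected by unknown or varying time lags between the two traces, which is the property the first challenge calls for. Whether the paper's dynamic program has this exact form is an assumption.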

Keywords