IEEE Access (Jan 2022)

A Scalable Analytical Framework for Complex Event Episode Mining With Various Domains Applications

  • Jerry C. C. Tseng,
  • Sun-Yuan Hsieh,
  • Vincent S. Tseng

DOI
https://doi.org/10.1109/ACCESS.2022.3228962
Journal volume & issue
Vol. 10
pp. 130672 – 130685

Abstract

Read online

With the ubiquity of sensor networks and smart devices that continuously collect data, we face the challenge of analyzing the growing stream of data in real time. In recent years, there has been a huge need to gain useful knowledge by incrementally analyzing event sequence data. Although episode pattern mining techniques have existed for years, people have recently become more aware of their practical value in solving real-life domain problems such as manufacturing records, stock markets, and weather forecasts. The effective and efficient application of episode pattern mining techniques to analyze complex event data is becoming increasingly important for solving real-life problems in wide domains. However, few studies have focused on developing a scalable framework based on episode pattern mining of complex event sequences for applications in various domains. In this work, we propose a novel framework named SAAF (Scalable Analytical Application Framework) based on complex event episode mining techniques, including batch episode mining, delta episode mining, incremental episode mining, and pattern merging, to consider both efficiency and accuracy. Moreover, to enhance scalability, we adopt the lambda architecture with Apache Spark and Apache Spark Streaming as the system development framework. Finally, the experimental results on three real datasets of different domains and two benchmark datasets showed that the proposed SAAF framework exhibits excellent performance in terms of efficiency, accuracy, and scalability.

Keywords