IEEE Access (Jan 2024)

Incremental Top-k High Utility Pattern Mining and Analyzing Over the Entire Accumulated Dynamic Database

  • Chanhee Lee,
  • Hanju Kim,
  • Myungha Cho,
  • Hyeonmo Kim,
  • Bay Vo,
  • Jerry Chun-Wei Lin,
  • Philippe Fournier-Viger,
  • Unil Yun

DOI
https://doi.org/10.1109/ACCESS.2024.3406562
Journal volume & issue
Vol. 12
pp. 77605 – 77620

Abstract

Top-k high utility pattern mining, which extracts the k patterns with the highest utility rather than requiring users to set a minimum utility threshold, has been actively studied. Most previous studies in this domain have focused on static databases, where data insertions do not occur. In the real world, however, various applications continuously generate new data, and existing top-k high utility pattern mining algorithms devised for static databases cannot handle incremental databases. Although some methods can handle stream data, they are limited to processing a portion of the database rather than the entire accumulated database. In this paper, we propose an efficient incremental mining method that discovers top-k high utility patterns from the entire accumulated database. The proposed approach uses a list structure that stores only the utility information required for the mining process and does not generate candidate itemsets. The algorithm processes each batch of incremental data with a single database scan and restructures the list for efficient mining. Moreover, four efficient threshold raising techniques, along with a restoring technique, are used to calculate the optimal threshold value in an accumulated incremental environment. Experimental results on runtime, memory usage, and scalability show that the proposed method efficiently processes the entire incremental database.
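
To make the problem setting concrete, the following is a minimal Python sketch of top-k high utility pattern mining over an accumulated database. It is not the list-based algorithm proposed in the paper: it enumerates itemsets by brute force, and its only "threshold raising" is a min-heap whose smallest entry acts as the current threshold. The names `utility`, `top_k_patterns`, and `AccumulatedMiner`, and the toy transactions, are assumptions made for illustration. Each transaction is assumed to map an item to its utility in that transaction (for example, quantity times unit profit).

```python
import heapq
from itertools import combinations


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if all(item in transaction for item in itemset):
        return sum(transaction[item] for item in itemset)
    return 0


def top_k_patterns(database, k):
    """Brute-force top-k itemsets by total utility (illustrative only)."""
    items = sorted({item for t in database for item in t})
    heap = []  # min-heap of (utility, itemset); heap[0] holds the threshold
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = sum(utility(itemset, t) for t in database)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # Replacing the weakest pattern raises the threshold.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, reverse=True)


class AccumulatedMiner:
    """Keeps every inserted transaction and mines the whole accumulation."""

    def __init__(self, k):
        self.k = k
        self.database = []

    def insert(self, batch):
        # Naive assumption: re-mine from scratch after each insertion.
        self.database.extend(batch)
        return top_k_patterns(self.database, self.k)


miner = AccumulatedMiner(k=2)
first = miner.insert([{"a": 5, "b": 2}, {"a": 5, "c": 1}, {"b": 4, "c": 4}])
second = miner.insert([{"b": 3, "c": 2}])
print(first)   # [(10, ('a',)), (8, ('b', 'c'))]
print(second)  # [(13, ('b', 'c')), (10, ('a',))]
```

In this toy run, the inserted transaction lifts the pattern `('b', 'c')` above `('a',)`, which shows why results must be computed over the entire accumulated database rather than over the newest portion alone. The sketch re-enumerates all itemsets after every insertion, which is exponential in the number of items; avoiding that cost is the motivation for the single-scan, candidate-free approach described in the abstract.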

Keywords