IEEE Access (Jan 2023)

FTKHUIM: A Fast and Efficient Method for Mining Top-K High-Utility Itemsets

  • Vinh V. Vu,
  • Mi T. H. Lam,
  • Thuy T. M. Duong,
  • Ly T. Manh,
  • Thuy T. T. Nguyen,
  • Le V. Nguyen,
  • Unil Yun,
  • Vaclav Snasel,
  • Bay Vo

DOI
https://doi.org/10.1109/ACCESS.2023.3314984
Journal volume & issue
Vol. 11
pp. 104789 – 104805

Abstract

High-utility itemset mining (HUIM) is an important task in knowledge discovery from data. Running a HUIM algorithm with an inappropriate user-defined minimum utility threshold leads to a large search space and a huge number of high-utility itemsets (HUIs). Determining a suitable threshold that yields the expected results is not simple and can take a lot of time, and for common users it is difficult to choose a minimum utility threshold that returns the right number of HUIs. If the threshold is set too high, too few HUIs are found. If it is set too low, too many HUIs are mined, wasting both time and memory. The top-k HUI mining problem was proposed to solve this issue, and many effective algorithms have since been introduced. This research introduces a novel approach, FTKHUIM (Fast top-k HUI Mining), for mining the top-k HUIs. It proposes a new threshold-raising strategy called RTU, which is based on transaction utility (TU), and shows that it rapidly increases the speed of top-k HUIM. The study also proposes a global structure that stores utility values while the threshold-raising strategies are applied, in order to optimize these strategies. Experimental results on various datasets show that FTKHUIM achieves better results in terms of both runtime and the search space explored.
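
To make the top-k formulation concrete, the following is a minimal brute-force sketch of the general idea: the user supplies k instead of a threshold, and an internal minimum utility starts at zero and is raised to the k-th best utility seen so far. It is not the FTKHUIM algorithm and does not implement RTU or any pruning; the toy database, the function names `utility` and `top_k_huis`, and the exhaustive enumeration are all assumptions made for illustration only.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to the
# utility that item contributes in that transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "b": 2, "d": 8},
]


def utility(itemset, db):
    """Utility of an itemset: summed over transactions containing all its items."""
    return sum(
        sum(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def top_k_huis(db, k):
    """Return the k itemsets with the highest utility, best first.

    Exhaustive enumeration (exponential in the number of items), so this
    is only usable on toy inputs. A real miner would prune the search
    space using the raised threshold.
    """
    items = sorted({i for t in db for i in t})
    heap = []      # min-heap of (utility, itemset); heap[0] is the k-th best so far
    min_util = 0   # internal threshold, raised as better itemsets are found
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, db)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > min_util:
                heapq.heapreplace(heap, (u, itemset))
            if len(heap) == k:
                min_util = heap[0][0]  # raise the threshold to the k-th best utility
    return sorted(heap, reverse=True)


result = top_k_huis(transactions, 3)
print(result)
```

On this toy database with k = 3, the internal threshold ends at 16, the utility of the third-best itemset. In an actual top-k miner, the faster this threshold rises, the more candidates can be discarded early, which is the role the abstract attributes to threshold-raising strategies.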

Keywords