Data Science and Engineering (Sep 2023)

A Reinduction-Based Approach for Efficient High Utility Itemset Mining from Incremental Datasets

  • Pushp Sra
  • Satish Chand

DOI
https://doi.org/10.1007/s41019-023-00229-4
Journal volume & issue
Vol. 9, no. 1
pp. 73 – 87

Abstract

High utility itemset mining is a crucial research area that focuses on identifying itemsets in a database whose utility value exceeds a user-specified threshold. Most existing algorithms, however, assume that the database is static, which is unrealistic for real-life datasets that grow continuously as new data arrive. Existing algorithms also rely solely on the utility value to identify relevant itemsets, so even the earliest-occurring itemsets are produced as output. Although some mining algorithms adopt a support-based approach to account for itemset frequency, they do not consider the temporal nature of itemsets. To address these challenges, this paper proposes the Scented Utility Miner (SUM) algorithm, which uses a reinduction strategy to track the recency of itemset occurrence and to mine itemsets from incremental databases. The paper thus provides a novel approach to mining high utility itemsets from dynamic databases and presents several experiments that demonstrate its effectiveness.

Keywords