IEEE Access (Jan 2024)
Frequent Closed High-Utility Itemset Mining Algorithm Based on Leiden Community Detection and Compact Genetic Algorithm
Abstract
Traditional pattern mining algorithms are based on tree and linked-list structures. However, they often consider only a single factor, either frequency or utility, must search an exponential space, and generate numerous candidates. We therefore propose LcGA-FCHUIM, a frequent closed high-utility itemset mining (FCHUIM) algorithm based on Leiden community detection and a compact genetic algorithm (LcGA). The algorithm first employs Leiden community detection to decompose a dataset into several communities of highly related transactions and then mines frequent itemsets with either an approximate or an exact strategy. It next checks the mined frequent itemsets for closure and finally employs the compact genetic algorithm to efficiently mine high-utility patterns from the frequent closed itemsets. Experiments were conducted on four real datasets: Retail and Chainstore, which are sparse, and Chess and Accidents, which are dense. The baselines were the modified traditional closed high-utility itemset mining algorithms CHUI-Miner-S and CLS-Miner-S and the most advanced frequent closed high-utility itemset mining algorithm, called FCHUIM. The results show that the average runtime of LcGA-FCHUIM is 40.5% lower than that of the best-performing baseline. Moreover, LcGA-FCHUIM mines 96% of the frequent closed high-utility itemsets on average, making it an effective algorithm for frequent closed high-utility itemset mining that is suitable for most scenarios.
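To make the target of the mining task concrete, the following minimal Python sketch enumerates the frequent closed high-utility itemsets of a toy dataset by brute force. It illustrates only the standard definitions (support, closure, utility) and is not the LcGA-FCHUIM procedure: it enumerates every candidate itemset, which is the exponential search that the proposed algorithm is designed to avoid. The transactions, utility values, thresholds, and function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its utility in that
# transaction (for example, purchased quantity times unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "b": 4},
    {"a": 5, "c": 3},
    {"b": 6},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset.issubset(t))


def utility(itemset, transactions):
    """Sum of the itemset's item utilities over the transactions containing it."""
    return sum(
        sum(t[item] for item in itemset)
        for t in transactions
        if itemset.issubset(t)
    )


def mine_fchui(transactions, min_support, min_utility):
    """Brute-force enumeration of frequent closed high-utility itemsets."""
    items = sorted(set().union(*transactions))

    # Step 1: frequent itemsets (support >= min_support).
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            sup = support(candidate, transactions)
            if sup >= min_support:
                frequent[candidate] = sup

    # Step 2: closure check. An itemset is closed if no proper superset has
    # the same support; such a superset would itself be frequent, so it is
    # enough to compare against the frequent itemsets.
    closed = {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other and sup == other_sup
                   for other, other_sup in frequent.items())
    }

    # Step 3: keep the closed itemsets whose utility reaches min_utility.
    result = {}
    for s, sup in closed.items():
        util = utility(s, transactions)
        if util >= min_utility:
            result[s] = (sup, util)
    return result


if __name__ == "__main__":
    found = mine_fchui(TRANSACTIONS, min_support=2, min_utility=14)
    for itemset, (sup, util) in sorted(found.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), "support:", sup, "utility:", util)
```

With these toy values, {c} is frequent but not closed, because {a, c} has the same support of 2, and {b} is closed but falls below the utility threshold. The itemsets {a}, {a, b}, and {a, c} remain.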
Keywords