An Efficient Tree-Based Algorithm for Mining High Average-Utility Itemset

Irfan Yildirim; Mete Celik

doi:10.1109/ACCESS.2019.2945840

IEEE Access (Jan 2019)

Affiliations

Irfan Yildirim: ORCiD; Department of Computer Engineering, Erciyes University, Kayseri, Turkey
Mete Celik: Department of Computer Engineering, Erciyes University, Kayseri, Turkey

DOI: https://doi.org/10.1109/ACCESS.2019.2945840
Journal volume & issue: Vol. 7, pp. 144245–144263

Abstract

High-utility itemset mining (HUIM), an extension of the well-known frequent itemset mining (FIM), has become a key topic in recent years. HUIM aims to find the complete set of itemsets with high utility in a given dataset. High average-utility itemset mining (HAUIM) is a variation of traditional HUIM. It provides an alternative measure, the average-utility, which takes into account both the utility values and the lengths of itemsets. HAUIM is important in several application domains, such as business applications, medical data analysis, mobile commerce, and streaming data analysis. Several algorithms in the literature introduce their own upper-bound models and data structures to discover high average-utility itemsets (HAUIs) in a given database. However, they suffer from long execution times and high memory consumption. To overcome these limitations, this paper first introduces four novel upper bounds, along with pruning strategies, and two data structures. It then proposes a pattern-growth approach, the HAUL-Growth algorithm, for efficiently mining HAUIs using the proposed upper bounds and data structures. Experimental results show that the proposed HAUL-Growth algorithm significantly outperforms the state-of-the-art dHAUIM and TUB-HAUIM algorithms in terms of execution time, number of join operations, memory consumption, and scalability.
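To make the average-utility measure concrete, the following is a minimal Python sketch. It assumes the definition commonly used in the HAUIM literature: the utility of an itemset, summed over the transactions that contain it, divided by the number of items in the itemset. The toy database `DB`, the function names, and the absolute threshold `min_au` are illustrative assumptions, not taken from the paper. The brute-force enumeration is not the HAUL-Growth algorithm; it only shows the exhaustive search that upper bounds and pruning strategies are meant to avoid.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example, quantity times unit profit).
DB = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
]


def utility(itemset, db):
    """Sum the utilities of the itemset's items over every transaction
    that contains all of them."""
    items = set(itemset)
    return sum(
        sum(t[i] for i in items)
        for t in db
        if items.issubset(t)
    )


def average_utility(itemset, db):
    """Utility of the itemset divided by its length (number of items)."""
    return utility(itemset, db) / len(set(itemset))


def mine_hauis_naive(db, min_au):
    """Enumerate every itemset and keep those whose average-utility is at
    least min_au. Exponential in the number of items; illustration only."""
    all_items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, k):
            au = average_utility(candidate, db)
            if au >= min_au:
                result[candidate] = au
    return result


if __name__ == "__main__":
    for itemset, au in mine_hauis_naive(DB, min_au=10).items():
        print(itemset, au)
```

In this toy database, the itemset {a, c} has utility 22 (6 in the first transaction, 16 in the second) and length 2, so its average-utility is 11. Dividing by the length keeps longer itemsets from qualifying merely because they accumulate more items.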

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
