Selective Database Projections Based Approach for Mining High-Utility Itemsets

Anita Bai; Parag S. Deshpande; Meera Dhabu

doi:10.1109/ACCESS.2017.2788083

IEEE Access (Jan 2018)

Affiliations

Anita Bai: ORCiD; Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
Parag S. Deshpande: Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
Meera Dhabu: Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India

Journal volume & issue: Vol. 6, pp. 14389–14409

Abstract

High-utility itemset mining (HUIM) is an emerging and widely used area of data mining. HUIM differs from frequent itemset mining (FIM): the latter considers only the frequency of itemsets, whereas the former accounts for both quantity and profit in order to reveal the most profitable products. The challenges of generating high-utility itemsets (HUIs) include exponential complexity in both time and space. Moreover, the search-space pruning techniques available in FIM because of the monotonic and anti-monotonic properties of frequent itemsets cannot be used in HUIM. In this paper, we propose a novel selective database projection-based HUI mining algorithm (SPHUI-Miner). We introduce an efficient data format, named HUI-RTPL, which is an optimum and compact representation of the data requiring low memory. We also propose two novel data structures, viz., the selective database projection utility list and the Tail-Count list, to prune the search space for HUI mining. Selective projections of the database reduce database scanning time, making the proposed approach more efficient. The approach creates unique data instances and new projections for data having fewer dimensions, thereby resulting in faster HUI mining. We also prove upper bounds on the amount of memory consumed by these projections. Experimental comparisons on various benchmark data sets show that the SPHUI-Miner algorithm outperforms state-of-the-art algorithms in terms of computation time, memory usage, scalability, and candidate generation.
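The contrast between FIM and HUIM drawn in the abstract can be illustrated with a small brute-force sketch. This is not SPHUI-Miner and uses none of the paper's data structures; it only applies the standard definitions of support and utility. The transactions, the unit profits, and the function names (`support`, `utility`, `high_utility_itemsets`) are hypothetical, chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 5},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 1, "c": 3}


def support(itemset, db):
    """FIM measure: number of transactions containing every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db, unit_profit):
    """HUIM measure: sum of quantity * unit profit over the itemset's items,
    taken across all transactions that contain the whole itemset."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def high_utility_itemsets(db, unit_profit, min_util):
    """Enumerate every non-empty itemset (exponential in the number of items)
    and keep those whose utility reaches min_util."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, db, unit_profit)
            if u >= min_util:
                result[itemset] = u
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, profit, min_util=16))
```

On this toy data the itemset {b} has utility 8 while its superset {a, b} has utility 17, so with a threshold of 16 a superset qualifies although one of its subsets does not. Support, by contrast, can only shrink as items are added, which is why support-based pruning does not carry over to utility and why the exhaustive enumeration above becomes impractical without dedicated pruning structures.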

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
