Hybrid Parallel Frequent Itemsets Mining Algorithm by Using N-List Structure

LIU Weiming, ZHANG Chi, MAO Yimin

doi:10.3778/j.issn.1673-9418.2008068

Jisuanji kexue yu tansuo (Jan 2022)

Hybrid Parallel Frequent Itemsets Mining Algorithm by Using N-List Structure

LIU Weiming, ZHANG Chi, MAO Yimin

Affiliations

LIU Weiming, ZHANG Chi, MAO Yimin: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.2008068
Journal volume & issue: Vol. 16, no. 1
pp. 120 – 136

Abstract

Read online

Aiming at the problem of unbalanced load, bad efficiency of N-list merge and redundant search for each node based on parallel frequent itemset mining algorithm MRPrePost (parallel PrePost algorithm based on MapReduce), this paper proposes a hybrid parallel frequent itemset mining algorithm based on N-list (HP-FIMBN). Firstly, a load estimation function (LE) is designed to calculate the load of each item in F-list, and a grouping method based on greedy strategy (GM-GS) is proposed, which not only deals with the problem of unbalanced load in the process of data partitioning, but also decreases the size of sub-PPC-tree on local node. Secondly, in order to accelerate the completion of the merging of two N-list structures, it designs an early abandon strategy (EAS), which can not only efficiently avoid invalid calculations in the merging process, but also does not need to traverse the initial N-list structure to get the final result. Finally, the set-enumeration tree is adopted as the search space, a superset equivalent strategy (SES) is proposed to avoid redundant search in the mining process and the final mining results are generated. Experimental results show that the modified algorithm has better performance on mining frequent itemsets in big data environment.

|frequent item mining|n-list|greedy strategy|set-enumeration tree|superset equivalent strategy (ses)

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org

About the journal

Abstract

Keywords