ICTACT Journal on Soft Computing (Jul 2022)

TOP-DOWN AND BOTTOM-UP APPROACH FOR MINING MULTILEVEL ASSOCIATION RULES FROM CONCEPT HIERARCHICAL DATA IN DISTRIBUTED ENVIRONMENT

  • Dinesh J. Prajapati

DOI
https://doi.org/10.21917/ijsc.2022.0385
Journal volume & issue
Vol. 12, no. 4
pp. 2697 – 2706

Abstract

Hierarchical data mining in a distributed environment is imperative in big data analysis. Multilevel association rules can provide more substantial information than single-level rules, and they also reveal hierarchical knowledge in the dataset. Nowadays, numerous e-commerce and social networking sites generate vast amounts of structured/semi-structured data in the form of sales data, tweets, text mails, web usage logs, and so on. The data generated from such sources is so large that it becomes very difficult to process and analyze using conventional approaches. This paper overcomes the computing limitation of a single node by distributing the task across a multi-node cluster. The performance of the system is compared based on the minimum support threshold at different levels of the concept hierarchy and by varying the dataset size. In this paper, the transactional dataset is created from a huge sales dataset using the Hadoop MapReduce framework. Then, two distributed multilevel frequent pattern mining algorithms, MR-MLAB (MapReduce-based Multilevel Apriori using Bottom-up approach) and MR-MLAT (MapReduce-based Multilevel Apriori using Top-down approach), are implemented to find interesting level-crossing frequent itemsets for each level of the concept hierarchy. Hierarchical redundancy in multilevel association rules affects the quality of the market basket analysis. Hence, to improve the performance of the system, the hierarchical redundancy has to be removed from the rules. Finally, the time efficiency of the proposed algorithms is compared with the existing Traditional Multilevel Apriori (TMLA) algorithm. The proposed algorithms with the MapReduce framework are found to be more efficient than the traditional algorithm.
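
The general idea of counting itemsets at one level of a concept hierarchy in map/reduce style can be illustrated with a minimal single-process sketch. It is not the MR-MLAB or MR-MLAT algorithm from the paper and does not run on Hadoop. The path-style item encoding (for example "food/dairy/milk"), the function names, and the absolute count thresholds are all assumptions made for illustration, and the top-down or bottom-up pruning between levels is deliberately left out.

```python
from collections import Counter
from itertools import combinations


def generalize(item, level):
    """Truncate a hierarchy path such as 'food/dairy/milk' to `level` parts.

    The slash-separated encoding is a hypothetical choice for this sketch.
    """
    return "/".join(item.split("/")[:level])


def map_phase(transaction, level, k):
    """Emit (itemset, 1) for every k-itemset of one transaction at `level`."""
    # Generalize each item, then deduplicate so an ancestor counts once.
    items = sorted({generalize(i, level) for i in transaction})
    for itemset in combinations(items, k):
        yield itemset, 1


def reduce_phase(pairs, min_count):
    """Sum the counts per itemset and keep those that reach `min_count`."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {s: c for s, c in counts.items() if c >= min_count}


def frequent_itemsets(transactions, level, k, min_count):
    """Run the map and reduce steps sequentially over all transactions."""
    pairs = (p for t in transactions for p in map_phase(t, level, k))
    return reduce_phase(pairs, min_count)


# Toy data, invented for this example only.
transactions = [
    ["food/dairy/milk", "food/bakery/bread"],
    ["food/dairy/cheese", "food/bakery/bread"],
    ["food/dairy/milk", "drink/juice/apple"],
    ["food/dairy/milk", "food/bakery/bread", "drink/juice/apple"],
]

level1 = frequent_itemsets(transactions, level=1, k=1, min_count=3)
level2_pairs = frequent_itemsets(transactions, level=2, k=2, min_count=2)
level3 = frequent_itemsets(transactions, level=3, k=1, min_count=3)
```

In a real cluster the map step would run on separate input splits and the framework would group the emitted pairs by key before the reduce step. A multilevel miner would additionally use the frequent itemsets of one level to restrict the candidates examined at the next, with a possibly different support threshold per level.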

Keywords