Journal of King Saud University: Computer and Information Sciences (Sep 2022)

Memory-optimized distributed utility mining for big data

  • Sunil Kumar,
  • Krishna Kumar Mohbey

Journal volume & issue
Vol. 34, no. 8
pp. 6491 – 6503

Abstract

In recent years, social media, online services, smartphones, and the Internet of Things (IoT) have been producing large quantities of data every second. The generated data is structured, unstructured, or semi-structured and is available in various formats. Therefore, traditional approaches are not sufficient to handle such data effectively. High utility pattern mining is a well-known area of data analytics that incorporates utility measures to account for user-based constraints, such as number of units and benefit, in addition to the frequency statistics of datasets. It is also essential for effective decision-making, and demand for it has increased over the last decade. Several techniques have been suggested for utility-based frequent pattern extraction. However, these approaches are limited in the data size they can handle and operate on standalone systems. To address this issue, we propose a parallel approach named distributed memory-optimized utility mining (DMOUM) for high utility-based frequent pattern mining. It implements a new pruning technique that greatly reduces processing time and memory use. The proposed approach can handle huge amounts of data, i.e., big data. DMOUM is implemented in a cluster-node architecture using Spark. The results are validated on various real-time datasets, and the proposed approach is found to perform better than state-of-the-art approaches in terms of run-time, memory consumption, and scalability. The proposed work can provide solutions in real-time application areas such as health care, education, and e-commerce.
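
To make the utility measure concrete, the following is a minimal single-machine sketch of how an itemset's utility can be computed from the number of units and a per-unit benefit, as opposed to counting frequency alone. It is not the DMOUM algorithm: it uses brute-force enumeration with no pruning and no Spark, and the toy transactions, the `unit_benefit` table, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> number of units bought.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical per-unit benefit of each item.
unit_benefit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_benefit):
    """Sum units * per-unit benefit over every transaction containing the whole itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_benefit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_benefit, min_utility):
    """Enumerate every itemset and keep those meeting the threshold (toy data only)."""
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_benefit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, unit_benefit, min_utility=15))
```

The enumeration grows exponentially with the number of distinct items, which is why practical miners depend on pruning and, for big data, on distributed execution.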

Keywords