IEEE Access (Jan 2023)

Efficient Method for Mining High Utility Occupancy Patterns Based on Indexed List Structure

  • Hyeonmo Kim,
  • Taewoong Ryu,
  • Chanhee Lee,
  • Sinyoung Kim,
  • Bay Vo,
  • Jerry Chun-Wei Lin,
  • Unil Yun

DOI
https://doi.org/10.1109/ACCESS.2023.3271864
Journal volume & issue
Vol. 11
pp. 43140 – 43158

Abstract


High utility pattern mining was proposed to improve on traditional support-based pattern mining methods, which process binary databases. High utility patterns are discovered by considering the quantity and importance of items. Recently, high utility occupancy pattern mining has been studied as a way to extract high-quality patterns by using both utility occupancy and a frequency measure. Although previous approaches provide valuable information in terms of utility occupancy, they are time-consuming because exploring entries in global data structures requires numerous comparison operations. This causes significant performance degradation when the database is large or the pre-defined threshold is low. An indexed list structure addresses this inefficiency of the list-based approach by structurally connecting each tuple. In this paper, we propose an efficient high utility occupancy mining approach based on novel indexed list-based structures. The two newly designed data structures maintain index information on items or patterns and facilitate rapid pattern extension. Compared with list-based approaches, our approach lowers the cost of generating long patterns by eliminating a large number of comparison operations. In addition, we devise novel construction and mining methods suited to the proposed data structures and utility occupancy functions. To narrow the wide search space, efficient pruning techniques are applied in the designed methods. Thorough performance experiments on real and synthetic datasets show that our method is more efficient than state-of-the-art methods across varying thresholds.
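
As an illustration of the two measures the abstract refers to, the sketch below computes support and utility occupancy by brute force on a toy database. It assumes the commonly used definition in which the utility occupancy of a pattern is its share of each supporting transaction's total utility, averaged over those transactions. The item names, utility values, thresholds, and function names are all hypothetical, and the code does not reproduce the indexed list structures or pruning techniques proposed in the paper.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility
# (quantity times unit profit, already multiplied out).
DATABASE = [
    {"a": 4, "b": 6},
    {"a": 2, "c": 8},
    {"a": 5, "b": 5, "c": 10},
]


def support(pattern, database):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in database if all(i in t for i in pattern))


def utility_occupancy(pattern, database):
    """Average the pattern's share of transaction utility over its supporting transactions."""
    shares = [
        sum(t[i] for i in pattern) / sum(t.values())
        for t in database
        if all(i in t for i in pattern)
    ]
    return sum(shares) / len(shares) if shares else 0.0


def mine(database, min_support, min_occupancy):
    """Enumerate every itemset naively (assumed baseline: no pruning, no index)."""
    items = sorted({i for t in database for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            occupancy = utility_occupancy(pattern, database)
            if support(pattern, database) >= min_support and occupancy >= min_occupancy:
                result[pattern] = occupancy
    return result
```

With these toy values, `mine(DATABASE, 2, 0.5)` keeps the patterns `("c",)`, `("a", "b")`, and `("a", "c")`. The exhaustive enumeration makes the cost of repeated comparisons visible, which is the overhead that index-based structures aim to reduce.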

Keywords