IEEE Access (Jan 2020)

Approximation of Probabilistic Maximal Frequent Itemset Mining Over Uncertain Sensed Data

  • Sheng Chen,
  • Lihai Nie,
  • Xiaoyi Tao,
  • Zhiyang Li,
  • Laiping Zhao

DOI
https://doi.org/10.1109/ACCESS.2020.2997409
Journal volume & issue
Vol. 8
pp. 97529 – 97539

Abstract

Read online

Event detection by discovering frequent itemsets is very popular in sensor network communities. However, the recorded data is often a probability rather than a determined value in a really productive environment as sensed data is often affected by noise. In this paper, we study to detect events by finding frequent patterns over probabilistic sensor data under the Possible World Semantics. This is technically challenging as probabilistic records can generate an exponential number of possible worlds. Although several efficient algorithms are proposed in the literature, it is still difficult to mine probabilistic maximal frequent items (PMFIs) in large uncertain database due to the high time complexity. To address this issue, we employ approximate idea to further reduce the time complexity from O(nlogn) to O(n) and propose a two-step solution (Aproximation Probabilistic Frequent Itemset-MAX, APFI-MAX in short) including PMFI candidates generation and PMFIs confirmation. We also provide the necessary proofs of our approximation method to make APFI-MAX more solid and convincing. Finally, extensive experiments have been conducted on synthetic and real databases, demonstrating that the proposed APFI-MAX always running faster than state-of-art methods under different parameter settings.

Keywords