IEEE Access (Jan 2018)

Frequent Itemsets Mining With Differential Privacy Over Large-Scale Data

  • Xinyu Xiong,
  • Fei Chen,
  • Peizhi Huang,
  • Miaomiao Tian,
  • Xiaofang Hu,
  • Badong Chen,
  • Jing Qin

DOI
https://doi.org/10.1109/ACCESS.2018.2839752
Journal volume & issue
Vol. 6
pp. 28877 – 28889

Abstract

Frequent itemsets mining with differential privacy is the problem of mining all frequent itemsets whose supports exceed a given threshold in a transactional dataset, under the constraint that the mined results must not compromise the privacy of any single transaction. Existing solutions do not balance efficiency, privacy, and data utility well on large-scale data. To address this, we propose an efficient, differentially private frequent itemsets mining algorithm for large-scale data. By combining sampling with transaction truncation under length constraints, our algorithm reduces computational cost and mining sensitivity, and thus improves data utility for a fixed privacy budget. Experimental results show that our algorithm achieves better performance than prior approaches on multiple datasets.
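
The link between truncation and sensitivity can be illustrated with a minimal sketch. It is not the paper's algorithm: it only counts single items, uses the standard Laplace mechanism, and makes no claim about how sampling affects the privacy guarantee. The function names, the `universe`, `max_len`, `sample_rate`, and `seed` parameters, and the random truncation rule are all assumptions made for illustration. The one fact it relies on is that if every transaction holds at most `max_len` distinct items, adding or removing one transaction changes the vector of item counts by at most `max_len` in L1 norm, so Laplace noise of scale `max_len / epsilon` suffices for the counts.

```python
import random
from collections import Counter


def truncate(transaction, max_len, rng):
    """Keep at most max_len distinct items (random choice is an assumption)."""
    items = sorted(set(transaction))
    if len(items) <= max_len:
        return items
    return sorted(rng.sample(items, max_len))


def laplace_noise(scale, rng):
    """The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_supports(transactions, universe, max_len, epsilon,
                        sample_rate=1.0, seed=0):
    """Hypothetical helper: sample, truncate, count items, add Laplace noise."""
    rng = random.Random(seed)
    # Sampling shrinks the data that has to be scanned.
    sampled = [t for t in transactions if rng.random() < sample_rate]
    # Truncation bounds how many counts one transaction can affect.
    truncated = [truncate(t, max_len, rng) for t in sampled]
    counts = Counter(item for t in truncated for item in t)
    # L1 sensitivity of the count vector is max_len after truncation.
    scale = max_len / epsilon
    # Noise is added for every item in the public universe, including
    # items with a true count of zero.
    noisy = {item: counts[item] + laplace_noise(scale, rng) for item in universe}
    return noisy, truncated


data = [["a", "b", "c", "d"], ["a", "b"], ["a"]]
noisy, kept = noisy_item_supports(
    data, universe=["a", "b", "c", "d"], max_len=2, epsilon=1.0, seed=1
)
```

A smaller `max_len` lowers the noise scale but discards more items from long transactions, which is the utility trade-off the abstract refers to.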

Keywords