IEEE Access (Jan 2016)

PTA: An Efficient System for Transaction Database Anonymization

  • Jerry Chun-Wei Lin,
  • Qiankun Liu,
  • Philippe Fournier-Viger,
  • Tzung-Pei Hong

DOI
https://doi.org/10.1109/ACCESS.2016.2596542
Journal volume & issue
Vol. 4
pp. 6467 – 6479

Abstract

Several approaches have been proposed to anonymize relational databases using the criterion of k-anonymity, to avoid the disclosure of sensitive information through re-identification attacks. A relational database is said to meet the criterion of k-anonymity if each record is identical to at least (k - 1) other records in terms of quasi-identifier attribute values. To anonymize a transactional database and satisfy the constraint of k-anonymity, each item must successively be considered as a quasi-identifier attribute. However, this process greatly increases dimensionality, and thus also the computational complexity of anonymization and the information loss. In this paper, a novel and efficient anonymization system called PTA is proposed, not only to anonymize transactional data with a small information loss but also to reduce the computational complexity of the anonymization process. The PTA system consists of three modules, a pre-processing module, a traveling salesman problem (TSP) module, and an anonymization module, which together anonymize transactional data and guarantee that at least k-anonymity is achieved. Extensive experiments have been carried out to compare the efficiency of the designed approach with state-of-the-art anonymization algorithms in terms of scalability, runtime, and information loss. Results indicate that the proposed PTA system outperforms the compared algorithms in all respects.
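
The k-anonymity criterion stated in the abstract can be illustrated with a minimal sketch. This is not the PTA system or any of its modules; it only checks the relational definition (each record matches at least k - 1 others on the quasi-identifier attributes). The function name `is_k_anonymous`, the attribute names, and the toy records are all assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every record shares its quasi-identifier values
    with at least k - 1 other records (illustrative check only)."""
    # Count how many records fall into each quasi-identifier combination.
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    # k-anonymity holds when every combination occurs at least k times.
    return all(count >= k for count in groups.values())


# Hypothetical toy table; attribute names and values are not from the paper.
records = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "asthma"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "diabetes"},
]

print(is_k_anonymous(records, ["zip", "age"], 2))
print(is_k_anonymous(records, ["zip", "age"], 3))
```

With the toy table, each (zip, age) combination occurs twice, so the check passes for k = 2 and fails for k = 3. In a transactional database, every distinct item would play the role of a quasi-identifier attribute, which is the dimensionality problem the abstract describes.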

Keywords