IEEE Access (Jan 2021)

A Method for Solving Quasi-Identifiers of Single Structured Relational Data

  • Yi Hua,
  • Zhangbing Li,
  • Baichuan Wang,
  • Jinsheng Li

DOI
https://doi.org/10.1109/ACCESS.2021.3135946
Journal volume & issue
Vol. 9
pp. 166293 – 166302

Abstract

A quasi-identifier is a set of attributes that can identify a specific entity in structured data and thus provides an inference path for query attacks. Improper selection of quasi-identifiers causes current privacy-preserving data publishing methods to fail. In this paper, we propose a method for solving quasi-identifiers based on functional dependency, to ensure the accuracy and completeness of the quasi-identifiers selected for relational data publishing. First, we partition the identifying attributes and sensitive attributes in the relational schema of the data to be published, according to semantic relationships and publishing requirements. Second, we mine the dependencies between the identifying attributes and the other attributes in the relational schema, according to the semantics and the instance data, and from these dependencies we obtain the complete quasi-identifiers. Finally, we implement the algorithm for solving quasi-identifiers in Python, solve quasi-identifiers on three real data sets of different sizes, and then run privacy protection experiments using the 3-anonymity, 2-diversity, and 1-differential privacy models. The results show that the average number of records per equivalence class partitioned on the solved quasi-identifier is 8% smaller than with five other methods, and the probability of privacy disclosure is reduced by about 3%. The accuracy and completeness of our method are therefore better than those of the five other methods.
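To make the two central notions concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that a candidate attribute set counts as a quasi-identifier when it functionally determines an identifying attribute in the instance data, and it groups records into equivalence classes to check k-anonymity. The function names, the attribute names (`name`, `zip`, `birth_year`, `sex`), and the toy records are all hypothetical.

```python
from collections import Counter


def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    The dependency holds when equal lhs values always imply equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight of key, else returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True


def equivalence_class_sizes(rows, quasi_identifier):
    """Count the records sharing each combination of quasi-identifier values."""
    return dict(Counter(tuple(row[a] for a in quasi_identifier) for row in rows))


def is_k_anonymous(rows, quasi_identifier, k):
    """Return True if every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(rows, quasi_identifier)
    return all(size >= k for size in sizes.values())


# Hypothetical toy table: "name" plays the role of the identifying attribute.
rows = [
    {"name": "A", "zip": "410001", "birth_year": 1980, "sex": "F"},
    {"name": "B", "zip": "410001", "birth_year": 1985, "sex": "M"},
    {"name": "C", "zip": "410002", "birth_year": 1980, "sex": "F"},
    {"name": "D", "zip": "410002", "birth_year": 1980, "sex": "M"},
]

# (zip, birth_year) does not determine name: records C and D collide.
fd_two_attrs = holds_fd(rows, ["zip", "birth_year"], ["name"])
# Adding sex separates every record, so the dependency holds in this instance.
fd_three_attrs = holds_fd(rows, ["zip", "birth_year", "sex"], ["name"])

zip_classes = equivalence_class_sizes(rows, ["zip"])
zip_is_2_anonymous = is_k_anonymous(rows, ["zip"], 2)
zip_year_is_2_anonymous = is_k_anonymous(rows, ["zip", "birth_year"], 2)
```

In this toy instance, grouping on `zip` alone yields two classes of two records each, while grouping on `zip` and `birth_year` leaves some records alone in their class. This illustrates why the choice of quasi-identifier changes the equivalence class sizes that an anonymity model then has to work with.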

Keywords