Algorithms (Oct 2013)
Pattern-Guided k-Anonymity
Abstract
We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times—the goal of the NP-hard k-ANONYMITY problem then is to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. Building on previous work and coping with corresponding deficiencies, we describe an enhanced k-anonymization problem called PATTERN-GUIDED k-ANONYMITY, where the users specify in which combinations suppressions may occur. In this way, the user of the anonymized data can express the differing importance of various data features. We show that PATTERN-GUIDED k-ANONYMITY is NP-hard. We complement this by a fixed-parameter tractability result based on a “data-driven parameterization” and, based on this, develop an exact integer linear program (ILP)-based solution method, as well as a simple, but very effective, greedy heuristic. Experiments on several real-world datasets show that our heuristic easily matches up to the established “Mondrian” algorithm for k-ANONYMITY in terms of the quality of the anonymization and outperforms it in terms of running time.
Keywords