A game theoretic framework for analyzing re-identification risk.

Zhiyu Wan; Yevgeniy Vorobeychik; Weiyi Xia; Ellen Wright Clayton; Murat Kantarcioglu; Ranjit Ganta; Raymond Heatherly; Bradley A Malin

doi:10.1371/journal.pone.0120592

PLoS ONE (Jan 2015)

DOI: https://doi.org/10.1371/journal.pone.0120592
Journal volume & issue: Vol. 10, no. 3
p. e0120592

Abstract

Given the wealth of insights that large databases of personal data can provide, many organizations aim to share data while protecting privacy by de-identifying it, but they remain concerned because various demonstrations have shown that such data can be re-identified. Yet these investigations focus on how attacks can be perpetrated, not on the likelihood that they will be realized. This paper introduces a game theoretic framework that enables a publisher to balance re-identification risk against the value of sharing data, leveraging the natural assumption that a recipient attempts re-identification only if its potential gains outweigh the costs. We apply the framework to a real case study, where the value of the data to the publisher is the actual grant funding dollar amounts from a national sponsor and the re-identification gain of the recipient is the fine paid to a regulator for violation of federal privacy rules. There are three notable findings: 1) it is possible to achieve zero risk, in that the recipient never gains from re-identification, while sharing almost as much data as the optimal solution that allows for a small amount of risk; 2) the zero-risk solution enables sharing much more data than a commonly invoked de-identification policy of the U.S. Health Insurance Portability and Accountability Act (HIPAA); and 3) a sensitivity analysis demonstrates that these findings are robust to order-of-magnitude changes in player losses and gains. Taken together, these findings support the view that such a framework can enable pragmatic policy decisions about de-identified data sharing.
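The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Policy` class, the function names, the three example policies, and every number below are hypothetical, and the sketch assumes a single recipient who attacks only when the expected gain exceeds a fixed attack cost, with the publisher choosing its sharing policy first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A hypothetical data-sharing policy (all fields are illustrative)."""
    name: str
    benefit: float           # publisher's value from sharing under this policy
    reid_probability: float  # assumed chance that an attack succeeds


def recipient_attacks(policy: Policy, gain: float, attack_cost: float) -> bool:
    """The recipient attacks only if expected gain outweighs the cost."""
    return policy.reid_probability * gain > attack_cost


def publisher_payoff(policy: Policy, gain: float, attack_cost: float,
                     loss: float) -> float:
    """Benefit of sharing, minus expected loss if the recipient attacks."""
    if recipient_attacks(policy, gain, attack_cost):
        return policy.benefit - policy.reid_probability * loss
    return policy.benefit


def best_policy(policies, gain, attack_cost, loss):
    """Policy with the highest publisher payoff, risk allowed."""
    return max(policies,
               key=lambda p: publisher_payoff(p, gain, attack_cost, loss))


def best_zero_risk_policy(policies, gain, attack_cost):
    """Most valuable policy under which the recipient never attacks."""
    safe = [p for p in policies
            if not recipient_attacks(p, gain, attack_cost)]
    return max(safe, key=lambda p: p.benefit) if safe else None


# Illustrative numbers only; none of them come from the paper.
POLICIES = [
    Policy("raw", benefit=1000.0, reid_probability=0.5),
    Policy("generalized", benefit=900.0, reid_probability=0.05),
    Policy("heavily_suppressed", benefit=400.0, reid_probability=0.001),
]
GAIN, ATTACK_COST, LOSS = 150.0, 10.0, 150.0

if __name__ == "__main__":
    print(best_policy(POLICIES, GAIN, ATTACK_COST, LOSS).name)
    print(best_zero_risk_policy(POLICIES, GAIN, ATTACK_COST).name)
```

With these made-up values, the payoff-maximizing policy tolerates an attack, while the best zero-risk policy gives up only part of the benefit, which mirrors the shape of the first finding without reproducing any of the paper's results.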

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
