Applied Network Science (Jun 2019)
The risk of node re-identification in labeled social graphs
Abstract
Real network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. When nodes have associated attributes, the privacy risks increase. In this paper we quantitatively study the impact of binary node attributes on node privacy by employing machine-learning-based re-identification attacks and exploring the interplay between graph topology and attribute placement. We also analyze the risk to anonymity in epidemic networks subject to different node re-identification attacks. Our experiments show that the population's diversity on the binary attribute consistently degrades anonymity. More interestingly, we show that similarly diverse populations in the SI epidemic model maintain different levels of anonymity under different infection rates.
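To make the epidemic setting concrete, the following is a minimal sketch of how a binary node attribute (infected or susceptible) can arise from an SI process on a graph. It is not the paper's experimental code: the discrete-time update rule, the ring topology, the function names (`simulate_si`, `attribute_diversity`, `ring_graph`), and the minority-fraction measure of diversity are all assumptions made for illustration.

```python
import random


def ring_graph(n):
    """Hypothetical toy topology: node i is linked to its two ring neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def simulate_si(adjacency, seeds, beta, steps, rng):
    """Assumed discrete-time SI process.

    At every step, each infected node infects each of its susceptible
    neighbors independently with probability beta. Infected nodes never
    recover, so the infected set can only grow.
    """
    infected = set(seeds)
    for _ in range(steps):
        newly_infected = set()
        for u in infected:
            for v in adjacency[u]:
                if v not in infected and rng.random() < beta:
                    newly_infected.add(v)
        infected |= newly_infected
    return infected


def attribute_diversity(infected, n):
    """Assumed diversity measure: the fraction of nodes in the minority class.

    0.0 means every node carries the same attribute value; 0.5 means the
    population is evenly split between the two values.
    """
    p = len(infected) / n
    return min(p, 1.0 - p)


if __name__ == "__main__":
    rng = random.Random(0)
    graph = ring_graph(100)
    for beta in (0.1, 0.5, 0.9):
        infected = simulate_si(graph, seeds=[0], beta=beta, steps=20, rng=rng)
        print(beta, len(infected), attribute_diversity(infected, len(graph)))
```

Under these assumptions, the infection rate `beta` and the number of steps jointly determine how many nodes carry each attribute value and where those nodes sit in the graph, so two runs can reach a similar diversity with different placements of the attribute.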
Keywords