International Journal of Population Data Science (Dec 2020)
Overcoming the Impasse 1: Reporting on Recent Australian Experiences in The Use of Privacy-Preserving Record Linkage Methods Using Bloom Filters
Abstract
Introduction
Notwithstanding the growth in the number and type of datasets being included in data linkage projects, some datasets remain ‘hard to include’ in operational linkage systems. Legal or regulatory constraints often restrict the release of personally identifying information from some datasets; in other cases, privacy or reputational risks prevent data release. Advances in privacy-preserving record linkage (PPRL) methods have made it possible to overcome this impasse.

Objectives and Approach
We present and describe a number of recent Australian ‘use cases’ where the PPRL-Bloom method has been used. For each, we report and reflect on the following:
a) The nature of the problem or ‘impasse’ being solved
b) The linkage model adopted
c) Quality, performance, privacy and automation
d) Challenges, insights and opportunities for improvement

Results
Australian projects utilising privacy-preserving record linkage (PPRL-Bloom) include several linking state-based datasets to Commonwealth datasets, some linking primary care data to state-based hospital and other health collections, and others linking state-based non-health datasets such as education, police and justice datasets.

Conclusion / Implications
PPRL is a useful and innovative methodology for providing access to some ‘hard to get’ datasets. It has already enabled a number of research projects that, for regulatory or other privacy-related reasons, would not otherwise have occurred. The use of PPRL in Australia appears likely to grow.