BMC Medical Genomics (Oct 2018)

Privacy-preserving record linkage in large databases using secure multiparty computation

  • Peeter Laud,
  • Alisa Pankova

DOI
https://doi.org/10.1186/s12920-018-0400-8
Journal volume & issue
Vol. 11, no. S4
pp. 33 – 46

Abstract

Background: Practical applications for data analysis may require combining multiple databases belonging to different owners, such as health centers. The analysis should be performed without violating the privacy of either the centers themselves or the patients whose records these centers store. To avoid biased analysis results, it may be important to remove duplicate records among the centers, so that each patient's data is taken into account only once. This task is very closely related to privacy-preserving record linkage.

Methods: This paper presents a solution to privacy-preserving deduplication among records of several databases using secure multiparty computation. It is built upon one of the fastest practical secure multiparty computation platforms, called Sharemind.

Results: Tests on ca. 10 million records of simulated databases, with 1000 health centers of 10000 records each, show that the computation is feasible in practice. The expected running time of the experiment is ca. 30 min for computing servers connected over a 100 Mbit/s WAN, the expected error of the results is 2^-40, and no errors have been detected for the particular test set that we used for our benchmarks.

Conclusions: The solution is ready for practical use. It has well-defined security properties, implied by the properties of the Sharemind platform. The solution assumes that exact matching of records is required; a possible direction for future research is extending it to approximate matching.

Keywords