Journal of Translational Medicine (Oct 2022)

Record linkage based patient intersection cardinality for rare disease studies using Mainzelliste and secure multi-party computation

  • Tobias Kussel,
  • Torben Brenner,
  • Galina Tremper,
  • Josef Schepers,
  • Martin Lablans,
  • Kay Hamacher

DOI
https://doi.org/10.1186/s12967-022-03671-6
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 14

Abstract

Read online

Abstract Background The low number of patients suffering from any given rare diseases poses a difficult problem for medical research: With the exception of some specialized biobanks and disease registries, potential study participants’ information are disjoint and distributed over many medical institutions. Whenever some of those facilities are in close proximity, a significant overlap of patients can reasonably be expected, further complicating statistical study feasibility assessments and data gathering. Due to the sensitive nature of medical records and identifying data, data transfer and joint computations are often forbidden by law or associated with prohibitive amounts of effort. To alleviate this problem and to support rare disease research, we developed the Mainzelliste Secure EpiLinker (MainSEL) record linkage framework, a secure Multi-Party Computation based application using trusted-third-party-less cryptographic protocols to perform privacy-preserving record linkage with high security guarantees. In this work, we extend MainSEL to allow the record linkage based calculation of the number of common patients between institutions. This allows privacy-preserving statistical feasibility estimations for further analyses and data consolidation. Additionally, we created easy to deploy software packages using microservice containerization and continuous deployment/continuous integration. We performed tests with medical researchers using MainSEL in real-world medical IT environments, using synthetic patient data. Results We show that MainSEL achieves practical runtimes, performing 10 000 comparisons in approximately 5 minutes. Our approach proved to be feasible in a wide range of network settings and use cases. The “lessons learned” from the real-world testing show the need to explicitly support and document the usage and deployment for both analysis pipeline integration and researcher driven ad-hoc analysis use cases, thus clarifying the wide applicability of our software. MainSEL is freely available under: https://github.com/medicalinformatics/MainSEL Conclusions MainSEL performs well in real-world settings and is a useful tool not only for rare disease research, but medical research in general. It achieves practical runtimes, improved security guarantees compared to existing solutions, and is simple to deploy in strict clinical IT environments. Based on the “lessons learned” from the real-word testing, we hope to enable a wide range of medical researchers to meet their needs and requirements using modern privacy-preserving technologies.

Keywords