Scientific Data (Jun 2024)
A study on interoperability between two Personal Health Train infrastructures in leukodystrophy data analysis
Abstract
Abstract The development of platforms for distributed analytics has been driven by a growing need to comply with various governance-related or legal constraints. Among these platforms, the so-called Personal Health Train (PHT) is one representative that has emerged over the recent years. However, in projects that require data from sites featuring different PHT infrastructures, institutions are facing challenges emerging from the combination of multiple PHT ecosystems, including data governance, regulatory compliance, or the modification of existing workflows. In these scenarios, the interoperability of the platforms is preferable. In this work, we introduce a conceptual framework for the technical interoperability of the PHT covering five essential requirements: Data integration, unified station identifiers, mutual metadata, aligned security protocols, and business logic. We evaluated our concept in a feasibility study that involves two distinct PHT infrastructures: PHT-meDIC and PADME. We analyzed data on leukodystrophy from patients in the University Hospitals of Tübingen and Leipzig, and patients with differential diagnoses at the University Hospital Aachen. The results of our study demonstrate the technical interoperability between these two PHT infrastructures, allowing researchers to perform analyses across the participating institutions. Our method is more space-efficient compared to the multi-homing strategy, and it shows only a minimal time overhead.