Journal of Medical Internet Research (Mar 2023)
A Platform for Data-Centric, Continuous Epidemiological Analyses (EpiGraphHub): Descriptive Analysis
Abstract
Background: Guaranteeing durability, provenance, accessibility, and trust in open data sets can be challenging for researchers and organizations that rely on public repositories of data critical for epidemiology and other health analytics. The required data repositories are often difficult to locate and may require conversion to a standard data format. Data-hosting websites may also change or become unavailable without warning. A single change to the rules in one repository can hinder updating a public dashboard reliant on data pulled from external sources. These concerns are particularly challenging at the international level, because policies on systems aimed at harmonizing health and related data are typically dictated by national governments to serve their individual needs.
Objective: In this paper, we introduce a comprehensive public health data platform, EpiGraphHub, that aims to provide a single interoperable repository for open health and related data.
Methods: The platform, curated by the international research community, allows secure local integration of sensitive data while facilitating the development of data-driven applications and reports for decision-makers. Its main components include centrally managed databases with fine-grained access control to data, fully automated and documented data collection and transformation, and a powerful web-based data exploration and visualization tool.
Results: EpiGraphHub is already being used for hosting a growing collection of open data sets and for automating epidemiological analyses based on them. The project has also released an open-source software library with the analytical methods used in the platform.
Conclusions: The platform is fully open source and open to external users. It is in active development with the goal of maximizing its value for large-scale public health studies.