BMC Medical Research Methodology (Apr 2024)

Overcoming denominator problems in refugee settings with fragmented electronic records for health and immigration data: a prediction-based approach

  • Stella Erdmann,
  • Rosa Jahn,
  • Sven Rohleder,
  • Kayvan Bozorgmehr

DOI
https://doi.org/10.1186/s12874-024-02204-7
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 14

Abstract

Background: Epidemiological studies in refugee settings are often challenged by the denominator problem, i.e. the lack of data on the population at risk. We develop an empirical approach to address this problem by assessing the relationships between occupancy data in refugee centres, the number of refugee patients in walk-in clinics, and diseases of the digestive system.

Methods: Individual-level patient data from a primary care surveillance system (PriCarenet) were matched with occupancy data retrieved from immigration authorities. The three relationships were analysed using regression models, considering age, sex, and type of centre. Predictions were then made for the data category that was not available in each of the relationships. Twenty-one German on-site health care facilities in state-level registration and reception centres participated in the study, which covered the period from November 2017 to July 2021.

Results: A total of 445 observations ("centre-months") of patient data from electronic health records (EHR) were available, covering 47,617 refugee patients (mean of 230 refugee patients visiting walk-in clinics per centre and month; standard deviation, sd: 202). Occupancy data (OCC) were available for 215 observations (mean occupancy of 348 residents; sd: 287), and both were available for 147 (matched), leaving 270 observations without occupancy data (EHR-unmatched) and 40 without patient data (OCC-unmatched). With patients as denominators, the incidence of diseases of the digestive system in the different sub-datasets was 9.2% (sd: 5.9) in EHR, 8.8% (sd: 5.1) in matched, 9.6% (sd: 6.4) in EHR-unmatched and 12% (sd: 2.9) in OCC-unmatched data. Using the available or predicted occupancy as denominator yielded average incidence estimates (per centre and month) of 4.7% (sd: 3.2) in matched, 4.8% (sd: 3.3) in EHR-unmatched and 7.4% (sd: 2.7) in OCC-unmatched data.

Conclusions: By modelling the ratio between patient and occupancy numbers in refugee centres depending on sex and age, as well as on the total number of patients or occupancy, the denominator problem in health monitoring systems could be mitigated. The approach helped to estimate the missing component of the denominator, and to compare disease frequency across time and refugee centres more accurately, using an empirically grounded prediction of disease frequency based on demographics and centre typology. This avoided the over-estimation of disease frequency that results from using patients as denominators.
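
The following is a minimal illustrative sketch of the general idea and not the authors' model: a relationship between patient counts and occupancy is fitted on matched centre-months and then used to predict the missing occupancy for an unmatched centre-month, so that incidence can be expressed per resident rather than per patient. All numbers, the function names `fit_line` and `incidence_percent`, and the use of a simple one-predictor least-squares line are assumptions made for illustration. The study itself used regression models that also considered age, sex, and type of centre.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (single predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def incidence_percent(cases, denominator):
    """Incidence as a percentage of the chosen denominator."""
    return 100.0 * cases / denominator


# Hypothetical matched centre-months (NOT data from the study):
# monthly patient counts and the corresponding centre occupancy.
matched_patients = [100, 200, 300, 400]
matched_occupancy = [160, 310, 460, 610]

a, b = fit_line(matched_patients, matched_occupancy)

# Hypothetical EHR-unmatched centre-month: patients known, occupancy missing.
patients = 250
digestive_cases = 20
predicted_occupancy = a + b * patients

per_patient = incidence_percent(digestive_cases, patients)
per_resident = incidence_percent(digestive_cases, predicted_occupancy)

print(f"Incidence per patient:  {per_patient:.1f}%")
print(f"Incidence per resident: {per_resident:.1f}%")
```

Because not every resident visits the clinic, the predicted occupancy in this toy example is larger than the patient count, so the per-resident estimate is lower than the per-patient one. This is the direction of the difference reported in the Results.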

Keywords