International Journal of Population Data Science (Apr 2024)

Determining households from patient addresses and unique property reference numbers in general practitioner electronic health records

  • Gillian Harper,
  • Nicola Firman,
  • Marta Wilk,
  • Milena Marszalek,
  • Paul Simon,
  • David Stables,
  • Rich Fry,
  • Kelvin Smith,
  • Carol Dezateux

DOI
https://doi.org/10.23889/ijpds.v9i1.2379
Journal volume & issue
Vol. 9, no. 1

Abstract


Introduction
Households are increasingly studied in population health research as an important context for understanding health and social behaviours and outcomes. Identifying household units of analysis in routinely collected data, rather than in traditional surveys, requires innovative and standardised tools, which do not currently exist.

Objectives
To design a utility that identifies households at a point in time from pseudonymised Unique Property Reference Numbers (UPRNs), known as Residential Anonymised Linkage Fields (RALFs), assigned to general practitioner (GP) patient addresses in electronic health records (EHRs) in north east London (NEL).

Methods
Rule-based logic was developed to identify households based on GP registration, address date, and RALF validity. The logic was tested in a use case on the household clustering of childhood weight status, and bias in the success of identifying households was examined in the use case cohort and in a full population cohort.

Results
92.1% of the use case cohort was assigned a household. The most frequent dominant reason (55.3%) for a household not being assigned was that a person had no valid household RALFs available across their patient registration address records. Other reasons were having no, or multiple, valid household RALFs, or not being alive at the event date. In the use case, children not assigned to a household were more likely to attend schools in City & Hackney and to live in the third most deprived quintile of lower super output areas. 88.9% of the population cohort was assigned a household. Patients not assigned to a household were more likely to be aged 18 to 45 years, to live in City & Hackney, and to live in the second most deprived quintile of lower super output areas.

Conclusions
We have developed a method for deriving households from primary care EHRs that can be implemented quickly and in real time, providing timely data to support population health research on households.
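As an illustration only, the kind of rule-based logic described in the Methods might be sketched as follows. This is not the authors' implementation: the record fields (`is_household`, `registered`, `start`, `end`), the function names, the order of the rules, and the treatment of a death on the event date are all assumptions made for the example. The failure reasons simply mirror those listed in the Results.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AddressRecord:
    """One patient address record (all field names are hypothetical)."""
    person_id: str
    ralf: Optional[str]   # pseudonymised UPRN; None if the address was not matched
    is_household: bool    # assumed flag: the RALF refers to a residential household
    start: date           # date the address became effective
    end: Optional[date]   # None means the address is still current
    registered: bool      # assumed flag: covered by an active GP registration


def assign_household(records, event_date, date_of_death=None):
    """Return (ralf, reason) for one person at a point in time."""
    # Assumption: a death on or before the event date counts as not alive.
    if date_of_death is not None and date_of_death <= event_date:
        return None, "not alive at event date"

    # Keep only records carrying a valid household RALF.
    valid = [r for r in records if r.ralf is not None and r.is_household]
    if not valid:
        return None, "no valid household RALF in any address record"

    # Distinct valid RALFs whose address dates span the event date.
    current = {
        r.ralf
        for r in valid
        if r.registered
        and r.start <= event_date
        and (r.end is None or event_date <= r.end)
    }
    if not current:
        return None, "no valid household RALF at event date"
    if len(current) > 1:
        return None, "multiple valid household RALFs at event date"
    return next(iter(current)), "assigned"


def build_households(people, event_date):
    """Group person IDs by assigned RALF.

    `people` maps person_id -> (list of AddressRecord, date_of_death or None).
    """
    households = defaultdict(list)
    for person_id, (records, date_of_death) in people.items():
        ralf, _reason = assign_household(records, event_date, date_of_death)
        if ralf is not None:
            households[ralf].append(person_id)
    return dict(households)
```

In this sketch, people who share the same RALF at the event date form one household, and anyone who fails a rule is left unassigned with a recorded reason, which is what allows the kind of bias analysis reported in the Results.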

Keywords