How far back do we need to look to capture diagnoses in electronic health records? A retrospective observational study of hospital electronic health record data
Tom Marshall,
Elizabeth Sapey,
Miles D Witham,
Steve Harris,
Rachel Cooper,
Chris Plummer,
Felicity Evison,
James Wason,
Heather J Cordell,
Suzy Gallier,
Fiona E Matthews,
Ewan Pearson,
Avan A Sayer,
Mervyn Singer,
Joanne Field,
Mohammed Osman,
Sian Robinson,
Victoria Bartle,
Thomas Scharf,
Jadene Lewis,
Rominique Doal,
Peta le Roux,
Ray Holding,
Paolo Missier
Affiliations
Kenya National Bureau of Statistics, Nairobi, Nairobi, Kenya
Tom Marshall
Institute of Applied Health Research, University of Birmingham, Birmingham, UK
Elizabeth Sapey
Institute of Inflammation and Ageing, University of Birmingham College of Medical and Dental Sciences, Birmingham, UK
Miles D Witham
AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
Steve Harris
Critical Care Department, University College London Hospitals NHS Foundation Trust, London, UK
Rachel Cooper
AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
Chris Plummer
Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Felicity Evison
Data Science Team, Research Development and Innovation, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
James Wason
MRC Biostatistics Unit, University of Cambridge, UK
Heather J Cordell
Suzy Gallier
PIONEER Health Data Research Hub, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
Fiona E Matthews
Ewan Pearson
University of Dundee, Dundee, UK
Avan A Sayer
AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
Mervyn Singer
Critical Care Department, University College London Hospitals NHS Foundation Trust, London, UK
Joanne Field
Genomics and Molecular Medicine Service, Nottingham University Hospitals NHS Trust, Nottingham, UK
Mohammed Osman
AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
Sian Robinson
Victoria Bartle
Thomas Scharf
Jadene Lewis
PIONEER Hub, University of Birmingham, Birmingham, UK
Rominique Doal
PIONEER Hub, University of Birmingham, Birmingham, UK
Peta le Roux
Digital Services, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
Objectives
Analysis of routinely collected electronic health data is a key tool for long-term condition research and practice for hospitalised patients. This requires accurate and complete ascertainment of a broad range of diagnoses, something not always recorded on an admission document at a single point in time. This study aimed to ascertain how far back in time electronic hospital records need to be interrogated to capture long-term condition diagnoses.

Design
Retrospective observational study of routinely collected hospital electronic health record data.

Setting
Queen Elizabeth Hospital Birmingham (UK)-linked data held by the PIONEER acute care data hub.

Participants
Patients whose first recorded admission for chronic obstructive pulmonary disease (COPD) exacerbation (n=560) or acute stroke (n=2142) was between January and December 2018 and who had a minimum of 10 years of data prior to the index date.

Outcome measures
We identified the most common International Classification of Diseases version 10-coded diagnoses received by patients with COPD and acute stroke separately. For each diagnosis, we derived the number of patients with the diagnosis recorded at least once over the full 10-year lookback period, and then compared this with shorter lookback periods from 1 year to 9 years prior to the index admission.

Results
Seven of the top 10 most common diagnoses in the COPD dataset reached >90% completeness by 6 years of lookback. Atrial fibrillation and diabetes were >90% coded with 2–3 years of lookback, but hypertension and asthma completeness continued to rise all the way out to 10 years of lookback. For stroke, 4 of the top 10 reached 90% completeness by 5 years of lookback; angina pectoris was >90% coded at 7 years and previous transient ischaemic attack completeness continued to rise out to 10 years of lookback.

Conclusion
A 7-year lookback captures most, but not all, common diagnoses. Lookback duration should be tailored to the conditions being studied.
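
The outcome measure described in the abstract (for each diagnosis, the number of patients captured within a shorter lookback divided by the number captured over the full 10 years) can be illustrated with a minimal Python sketch. This is not the study's own code. The function name `lookback_completeness`, the record layout, the 365.25-day year, and the choice to ignore diagnoses dated on or after the index admission are all assumptions made for illustration, and the sample records are invented.

```python
from collections import defaultdict
from datetime import date
from math import ceil


def lookback_completeness(records, index_dates, max_years=10):
    """Return {icd10_code: {lookback_years: proportion_captured}}.

    records     -- iterable of (patient_id, icd10_code, diagnosis_date)
    index_dates -- dict mapping patient_id to index admission date
    The denominator for each code is the number of patients with that
    code recorded at least once within the full max_years lookback.
    """
    # Shortest whole-year lookback that captures each (code, patient) pair.
    years_needed = defaultdict(dict)
    for patient_id, code, dx_date in records:
        index_date = index_dates.get(patient_id)
        # Assumption: only diagnoses strictly before the index date count.
        if index_date is None or dx_date >= index_date:
            continue
        years = ceil((index_date - dx_date).days / 365.25)
        if years > max_years:
            continue  # outside the full lookback window
        previous = years_needed[code].get(patient_id, max_years + 1)
        years_needed[code][patient_id] = min(previous, years)

    result = {}
    for code, per_patient in years_needed.items():
        total = len(per_patient)  # captured over the full lookback
        result[code] = {
            y: sum(1 for needed in per_patient.values() if needed <= y) / total
            for y in range(1, max_years + 1)
        }
    return result


# Hypothetical example data: three patients sharing one index date.
index_dates = {p: date(2018, 6, 1) for p in ("A", "B", "C")}
records = [
    ("A", "I10", date(2017, 12, 1)),  # captured with 1 year of lookback
    ("B", "I10", date(2012, 6, 1)),   # needs 6 years
    ("C", "I10", date(2009, 1, 1)),   # needs 10 years
    ("A", "E11", date(2016, 6, 1)),   # needs 2 years
    ("A", "J45", date(2005, 1, 1)),   # older than 10 years, ignored
    ("B", "J45", date(2018, 7, 1)),   # after index date, ignored
]
completeness = lookback_completeness(records, index_dates)
```

Under these assumptions, a diagnosis counts as ">90% complete" at the shortest lookback `y` for which `completeness[code][y]` exceeds 0.9. Because the denominator is the 10-year count, every diagnosis reaches 100% at 10 years by construction, so a curve that is still rising at 10 years suggests that an even longer window would capture more patients.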