BMC Medical Informatics and Decision Making (Jun 2024)

Undercounting diagnoses in Australian general practice: a data quality study with implications for population health reporting

  • Rachel Canaway,
  • Christine Chidgey,
  • Christine Mary Hallinan,
  • Daniel Capurro,
  • Douglas IR Boyle

DOI
https://doi.org/10.1186/s12911-024-02560-w
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 9

Abstract

Background: Diagnoses can be recorded in electronic medical records (EMRs) either as free text or as a term with a diagnosis code. Researchers, governments, and agencies, including organisations that deliver incentivised primary care quality improvement programs, frequently use coded data only and often ignore free-text entries. Diagnosis data are reported for population healthcare planning, including resource allocation for patient care. This study sought to determine whether diagnosis counts based on coded diagnosis data alone led to underreporting of disease prevalence and, if so, to what extent, for six common or important chronic diseases.

Methods: This cross-sectional data quality study used de-identified EMR data from 84 general practices in Victoria, Australia. Data represented 456,125 patients who attended one of the general practices three or more times in two years between January 2021 and December 2022. We reviewed the percentage and proportional difference between patient counts of coded diagnosis entries alone and patient counts of clinically validated free-text entries for asthma, chronic kidney disease, chronic obstructive pulmonary disease, dementia, type 1 diabetes and type 2 diabetes.

Results: Undercounts were evident in all six diagnoses when using coded diagnoses alone (2.57–36.72% undercount); of these, five were statistically significant. Overall, 26.4% of all patient diagnoses had not been coded. There was high variation between practices in the recording of coded diagnoses, but type 2 diabetes was well captured in coded form by most practices.

Conclusion: In Australia, clinical decision support and the reporting of aggregated patient diagnosis data to government that rely on coded diagnoses can lead to significant underreporting of diagnoses compared to counts that also incorporate clinically validated free-text diagnoses. Diagnosis underreporting can impact on population health, healthcare planning, resource allocation, and patient care. We propose the use of phenotypes derived from clinically validated text entries to enhance the accuracy of diagnosis and disease reporting. There are existing technologies and collaborations from which to build trusted mechanisms to provide greater reliability of general practice EMR data used for secondary purposes.

Keywords