Are ICD codes reliable for observational studies? Assessing coding consistency for data quality

Stuart J. Nelson; Ying Yin; Eduardo A. Trujillo Rivera; Yijun Shao; Phillip Ma; Mark S. Tuttle; Jennifer Garvin; Qing Zeng-Treitler

doi:10.1177/20552076241297056

Digital Health (Oct 2024)

Affiliations

Stuart J. Nelson: Biomedical Informatics Center, George Washington University, Washington, DC, USA
Ying Yin: Center for Data Science and Outcomes Research, , Washington, DC, USA
Eduardo A. Trujillo Rivera: Center for Data Science and Outcomes Research, , Washington, DC, USA
Yijun Shao: Center for Data Science and Outcomes Research, , Washington, DC, USA
Phillip Ma: Center for Data Science and Outcomes Research, , Washington, DC, USA
Mark S. Tuttle: , Hingham, MA, USA
Jennifer Garvin: Centers for Health Services Research, Regenstrief Institute, Inc., Indianapolis, IN, USA
Qing Zeng-Treitler: Center for Data Science and Outcomes Research, , Washington, DC, USA

DOI: https://doi.org/10.1177/20552076241297056
Journal volume & issue: Vol. 10

Abstract

Objective: International Classification of Diseases (ICD) codes recorded in electronic health records (EHRs) are frequently used to create patient cohorts or define phenotypes. Inconsistent assignment of codes may reduce the utility of such cohorts. We assessed the reliability across time and location of the assignment of ICD codes in a US health system at the time of the transition from ICD-9-CM (ICD, 9th Revision, Clinical Modification) to ICD-10-CM (ICD, 10th Revision, Clinical Modification).

Materials and methods: Using clusters of equivalent codes derived from the US Centers for Disease Control and Prevention General Equivalence Mapping (GEM) tables, ICD assignments occurring during the ICD-9-CM to ICD-10-CM transition were investigated in EHR data from the US Veterans Administration Central Data Warehouse using deep learning and statistical models. These models were then used to detect abrupt changes across the transition; additionally, changes at each VA station were examined.

Results: Many of the 687 most-used code clusters had ICD-10-CM assignments that differed greatly from those predicted from the codes used in ICD-9-CM. Manual review of a random sample found that 66% of the clusters showed problematic changes, with 37% having no apparent explanation. Notably, the observed pattern of changes varied widely across care locations.

Discussion and conclusion: The observed coding variability across time and across location suggests that ICD codes in EHRs are insufficient to establish a semantically reliable cohort or phenotype. Although some variation might be expected with a change in coding structure, the inconsistency across locations suggests other difficulties. Researchers should consider carefully how cohorts and phenotypes of interest are selected and defined.
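
Illustrative note on the methods: the abstract refers to "clusters of equivalent codes" derived from the GEM tables but does not say how the clusters were built. One plausible reading, offered here only as an assumption, is that each cluster is a connected component of the graph whose edges are the ICD-9-CM to ICD-10-CM mapping pairs. The minimal sketch below shows that idea with a union-find structure. The function name `build_clusters` and the code labels in `toy_pairs` are hypothetical placeholders, not real ICD codes and not the authors' implementation.

```python
from collections import defaultdict


def build_clusters(pairs):
    """Group codes into clusters of mutually reachable codes.

    `pairs` is an iterable of (icd9_code, icd10_code) mapping pairs.
    ASSUMPTION: a cluster is a connected component of the mapping graph;
    the source abstract does not confirm this construction.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag each code with its code system so identical strings in the two
    # systems are never confused with one another.
    for icd9, icd10 in pairs:
        union(("ICD9", icd9), ("ICD10", icd10))

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in groups.values())


# Hypothetical placeholder pairs (NOT real GEM entries).
toy_pairs = [("A1", "X1"), ("A2", "X1"), ("A2", "X2"), ("B1", "Y1")]
clusters = build_clusters(toy_pairs)
for cluster in clusters:
    print(cluster)
```

Under this reading, counting code assignments per cluster rather than per individual code would let usage before and after the transition be compared on a like-for-like basis, which is the kind of comparison the abstract describes.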

Published in Digital Health

ISSN: 2055-2076 (Online)
Publisher: SAGE Publishing
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://journals.sagepub.com/home/dhj
