Electronic Health Record Data Quality and Performance Assessments: Scoping Review

Yordan P Penev; Timothy R Buchanan; Matthew M Ruppert; Michelle Liu; Ramin Shekouhi; Ziyuan Guan; Jeremy Balch; Tezcan Ozrazgat-Baslanti; Benjamin Shickel; Tyler J Loftus; Azra Bihorac

doi:10.2196/58130

JMIR Medical Informatics (Nov 2024)

DOI: https://doi.org/10.2196/58130
Journal volume & issue: Vol. 12, p. e58130

Abstract

Background: Electronic health records (EHRs) have an enormous potential to advance medical research and practice through easily accessible and interpretable EHR-derived databases. Attainability of this potential is limited by issues with data quality (DQ) and performance assessment.

Objective: This review aims to streamline the current best practices on EHR DQ and performance assessments as a replicable standard for researchers in the field.

Methods: PubMed was systematically searched for original research articles assessing EHR DQ and performance from inception until May 7, 2023.

Results: Our search yielded 26 original research articles. Most articles had 1 or more significant limitations, including incomplete or inconsistent reporting (n=6, 30%), poor replicability (n=5, 25%), and limited generalizability of results (n=5, 25%). Completeness (n=21, 81%), conformance (n=18, 69%), and plausibility (n=16, 62%) were the most cited indicators of DQ, while correctness or accuracy (n=14, 54%) was most cited for data performance, with context-specific supplementation by recency (n=7, 27%), fairness (n=6, 23%), stability (n=4, 15%), and shareability (n=2, 8%) assessments. Artificial intelligence–based techniques, including natural language data extraction, data imputation, and fairness algorithms, were demonstrated to play a rising role in improving both dataset quality and performance.

Conclusions: This review highlights the need for incentivizing DQ and performance assessments and their standardization. The results suggest the usefulness of artificial intelligence–based techniques for enhancing DQ and performance to unlock the full potential of EHRs to improve medical research and practice.

Published in JMIR Medical Informatics

ISSN: 2291-9694 (Online)
Publisher: JMIR Publications
Country of publisher: Canada
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://medinform.jmir.org
