Comparison of Diagnosis Codes to Clinical Notes in Classifying Patients with Diabetic Retinopathy

Sean Yonamine, MPH; Chu Jian Ma, MD, PhD; Rolake O. Alabi, MD, PhD; Georgia Kaidonis, MBBS, PhD; Lawrence Chan, MD; Durga Borkar, MD; Joshua D. Stein, MD, MS; Benjamin F. Arnold, PhD; Catherine Q. Sun, MD

Ophthalmology Science (Nov 2024)

Affiliations

Sean Yonamine, MPH: Department of Ophthalmology, University of California, San Francisco, California; Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
Chu Jian Ma, MD, PhD: Department of Ophthalmology, University of California, San Francisco, California
Rolake O. Alabi, MD, PhD: Department of Ophthalmology, University of California, San Francisco, California
Georgia Kaidonis, MBBS, PhD: Department of Ophthalmology, University of California, San Francisco, California
Lawrence Chan, MD: Department of Ophthalmology, University of California, San Francisco, California
Durga Borkar, MD: Department of Ophthalmology, Duke University, Durham, North Carolina
Joshua D. Stein, MD, MS: Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan
Benjamin F. Arnold, PhD: Department of Ophthalmology, University of California, San Francisco, California; F.I. Proctor Foundation, University of California, San Francisco, California; Institute for Global Health Sciences, University of California, San Francisco, California
Catherine Q. Sun, MD: Department of Ophthalmology, University of California, San Francisco, California; F.I. Proctor Foundation, University of California, San Francisco, California

Correspondence: Catherine Q. Sun, MD, Department of Ophthalmology, 490 Illinois Street, San Francisco, CA 94158.

Journal volume & issue: Vol. 4, no. 6, p. 100564

Abstract

Purpose: Electronic health records (EHRs) contain a vast amount of clinical data. Improved automated classification approaches have the potential to accurately and efficiently identify patient cohorts for research. We evaluated whether a rule-based natural language processing (NLP) algorithm using clinical notes performed better than International Classification of Diseases, ninth edition (ICD-9) or 10th edition (ICD-10) codes for classifying proliferative diabetic retinopathy (PDR) and nonproliferative diabetic retinopathy (NPDR) severity.

Design: Cross-sectional study.

Subjects: Deidentified EHR data from an academic medical center identified 2366 patients aged ≥18 years with diabetes mellitus, diabetic retinopathy (DR), and available clinical notes.

Methods: From these 2366 patients, 306 randomly selected patients (100 training set, 206 test set) underwent chart review by ophthalmologists to establish the gold standard. International Classification of Diseases codes were extracted from the EHR. The notes algorithm identified positive mentions of PDR and NPDR severity in clinical notes. Proliferative diabetic retinopathy and NPDR severity classifications by ICD codes and by the notes algorithm were compared with the gold standard. The entire DR cohort (N = 2366) was then classified as having presence (or absence) of PDR using ICD codes and the notes algorithm.

Main Outcome Measures: Sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F1 score for the notes algorithm compared with ICD codes using a gold standard of chart review.

Results: For PDR classification of the test set patients, the notes algorithm performed better than ICD codes for all metrics. Specifically, the notes algorithm had significantly higher sensitivity (90.5% [95% confidence interval 85.7, 94.9] vs. 68.4% [60.4, 75.3]) but similar PPV (98.0% [95.4, 100] vs. 94.7% [90.3, 98.3]). The F1 score was 0.941 [0.910, 0.966] for the notes algorithm compared with 0.794 [0.734, 0.842] for ICD codes. For PDR classification, ICD-10 codes performed better than ICD-9 codes (F1 score 0.836 [0.771, 0.878] vs. 0.596 [0.222, 0.692]). For NPDR severity classification, the notes algorithm performed similarly to ICD codes, but performance was limited by small sample size.

Conclusions: The notes algorithm outperformed ICD codes for PDR classification. The findings demonstrate the significant potential of applying a rule-based NLP algorithm to clinical notes to increase the efficiency and accuracy of cohort selection for research.

Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
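For readers unfamiliar with the approach, the following is a minimal illustrative sketch of how a rule-based detector for positive PDR mentions might be scored against a chart-review gold standard using the outcome measures named above. It is not the authors' algorithm: the regular expressions, the negation cues, the function names, and the example notes are all hypothetical and chosen only to make the idea concrete.

```python
import re

# Hypothetical pattern for a PDR mention. The lookbehinds avoid matching the
# tail of "non-proliferative" / "non proliferative"; "NPDR" is already excluded
# because there is no word boundary between "N" and "PDR".
PDR_MENTION = re.compile(
    r"(?<!non-)(?<!non )\b(?:pdr|proliferative diabetic retinopathy)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues; a production rule set would need far more.
NEGATION_CUE = re.compile(r"\b(?:no|not|without|negative for|rule out)\b", re.IGNORECASE)


def has_positive_pdr_mention(note: str) -> bool:
    """Return True if any sentence mentions PDR with no negation cue before it."""
    for sentence in re.split(r"[.;\n]", note):
        for match in PDR_MENTION.finditer(sentence):
            if not NEGATION_CUE.search(sentence[: match.start()]):
                return True
    return False


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(predicted, gold) -> dict:
    """Compute sensitivity, specificity, PPV, NPV, and F1 from paired booleans."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    tn = sum(1 for p, g in pairs if not p and not g)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Invented example notes with invented gold labels (True = PDR present).
    examples = [
        ("PDR OD s/p PRP.", True),
        ("History of proliferative diabetic retinopathy; no macular edema.", True),
        ("Severe NPDR both eyes. No PDR.", False),
        ("Non-proliferative diabetic retinopathy, moderate.", False),
        ("Neovascularization of the disc noted, treated with PRP.", True),
        ("Rule out PDR at next visit.", False),
    ]
    predictions = [has_positive_pdr_mention(text) for text, _ in examples]
    gold_labels = [label for _, label in examples]
    for name, value in classification_metrics(predictions, gold_labels).items():
        print(f"{name}: {value:.3f}")
```

In this toy example, the fifth note describes neovascularization without naming PDR, so a purely mention-based rule misses it. That kind of miss lowers sensitivity while leaving PPV unaffected. The confidence intervals reported in the abstract would require an additional procedure, such as bootstrapping, that is not shown here.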

Published in Ophthalmology Science

ISSN: 2666-9145 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Ophthalmology
Website: https://www.journals.elsevier.com/ophthalmology-science/
