Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring (Jul 2024)

Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data

  • Xing He,
  • Ruoqi Wei,
  • Yu Huang,
  • Zhaoyi Chen,
  • Tianchen Lyu,
  • Sarah Bost,
  • Jiayi Tong,
  • Lu Li,
  • Yujia Zhou,
  • Zhao Li,
  • Jingchuan Guo,
  • Huilin Tang,
  • Fei Wang,
  • Steven DeKosky,
  • Hua Xu,
  • Yong Chen,
  • Rui Zhang,
  • Jie Xu,
  • Yi Guo,
  • Yonghui Wu,
  • Jiang Bian

DOI
https://doi.org/10.1002/dad2.12613
Journal volume & issue
Vol. 16, no. 3

Abstract

INTRODUCTION
Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnosis codes. This study aimed to develop a more accurate computable phenotype (CP) for identifying AD patients using structured and unstructured EHR data.

METHODS
We used EHRs from the University of Florida Health (UFHealth) system and created rule‐based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UTHealth) and the University of Minnesota (UMN).

RESULTS
Our best‐performing CP was “patient has at least 2 AD diagnoses and AD‐related keywords in AD encounters,” with an F1‐score of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN.

DISCUSSION
We developed and validated rule‐based CPs for AD identification with good performance, which will be crucial for studies that aim to use real‐world data such as EHRs.

Highlights

  • Developed a computable phenotype (CP) to identify Alzheimer's disease (AD) patients using EHR data.
  • Utilized both structured and unstructured EHR data to enhance CP accuracy.
  • Achieved F1‐scores of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN.
  • Validated the CP across different demographics, ensuring robustness and fairness.
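
As an illustration only, the best‐performing rule quoted in the results and the F1‐score used to evaluate it could be sketched as follows. This is not the authors' implementation: the abstract does not list the diagnosis codes or keywords, so the code prefixes (ICD‐10‐CM G30 and ICD‐9‐CM 331.0), the keyword list, the `Encounter` structure, and the reading of "at least 2 AD diagnoses" as "at least two encounters carrying an AD code" are all assumptions made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumption: AD diagnosis codes are ICD-10-CM G30.* and ICD-9-CM 331.0.
# The abstract does not state which codes the study used.
AD_CODE_PREFIXES = ("G30", "331.0")

# Hypothetical keyword list; the study's actual keywords are not given here.
AD_KEYWORDS = ("alzheimer",)


@dataclass
class Encounter:
    """Hypothetical, simplified view of one EHR encounter."""
    diagnosis_codes: list[str] = field(default_factory=list)
    note_text: str = ""


def is_ad_encounter(enc: Encounter) -> bool:
    """True if the encounter carries at least one AD diagnosis code."""
    return any(code.upper().startswith(AD_CODE_PREFIXES)
               for code in enc.diagnosis_codes)


def meets_cp(encounters: list[Encounter], min_diagnoses: int = 2) -> bool:
    """Rule: >= min_diagnoses AD-coded encounters AND an AD keyword
    appears in the notes of at least one of those AD encounters."""
    ad_encounters = [e for e in encounters if is_ad_encounter(e)]
    if len(ad_encounters) < min_diagnoses:
        return False
    return any(kw in e.note_text.lower()
               for e in ad_encounters for kw in AD_KEYWORDS)


def f1_score(predicted: list[bool], actual: list[bool]) -> float:
    """F1 = harmonic mean of precision and recall against chart review."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `predicted` would hold the output of `meets_cp` for each patient and `actual` the chart‐review label, so that `f1_score` summarizes agreement at a given site.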

Keywords