Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data

Xing He; Ruoqi Wei; Yu Huang; Zhaoyi Chen; Tianchen Lyu; Sarah Bost; Jiayi Tong; Lu Li; Yujia Zhou; Zhao Li; Jingchuan Guo; Huilin Tang; Fei Wang; Steven DeKosky; Hua Xu; Yong Chen; Rui Zhang; Jie Xu; Yi Guo; Yonghui Wu; Jiang Bian

doi:10.1002/dad2.12613

Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring (Jul 2024)

Affiliations

Xing He: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Ruoqi Wei: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Yu Huang: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Zhaoyi Chen: Center for Biomedical Informatics & Information Technology National Cancer Institute Rockville Maryland USA
Tianchen Lyu: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Sarah Bost: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Jiayi Tong: Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Philadelphia Pennsylvania USA
Lu Li: Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Philadelphia Pennsylvania USA
Yujia Zhou: Biomedical Informatics and Data Science School of Medicine, Yale New Haven Connecticut USA
Zhao Li: School of Biomedical Informatics University of Texas Health Science Center at Houston Houston Texas USA
Jingchuan Guo: Department of Pharmaceutical Outcomes and Policy College of Pharmacy University of Florida Gainesville Florida USA
Huilin Tang: Department of Pharmaceutical Outcomes and Policy College of Pharmacy University of Florida Gainesville Florida USA
Fei Wang: Department of Population Health Sciences Weill Cornell Medicine New York New York USA
Steven DeKosky: Department of Neurology College of Medicine University of Florida Gainesville Florida USA
Hua Xu: Biomedical Informatics and Data Science School of Medicine, Yale New Haven Connecticut USA
Yong Chen: Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Philadelphia Pennsylvania USA
Rui Zhang: Division of Computational Health Sciences Department of Surgery University of Minnesota Minneapolis Minnesota USA
Jie Xu: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Yi Guo: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Yonghui Wu: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA
Jiang Bian: Department of Health Outcomes & Biomedical Informatics College of Medicine University of Florida Gainesville Florida USA

DOI: https://doi.org/10.1002/dad2.12613
Journal volume & issue: Vol. 16, no. 3

Abstract

INTRODUCTION: Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnosis codes. This study aimed to develop a more accurate computable phenotype (CP) for identifying AD patients using structured and unstructured EHR data.

METHODS: We used EHRs from the University of Florida Health (UFHealth) system and created rule‐based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UTHealth) and the University of Minnesota (UMN).

RESULTS: Our best‐performing CP was “patient has at least 2 AD diagnoses and AD‐related keywords in AD encounters,” with an F1‐score of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN.

DISCUSSION: We developed and validated rule‐based CPs for AD identification with good performance, which will be crucial for studies that aim to use real‐world data like EHRs.

Highlights

Developed a computable phenotype (CP) to identify Alzheimer's disease (AD) patients using EHR data.
Utilized both structured and unstructured EHR data to enhance CP accuracy.
Achieved an F1‐score of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN.
Validated the CP across different demographics, ensuring robustness and fairness.
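The best‐performing rule quoted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not list the diagnosis codes or keywords used, so the ICD‐10‐CM G30.x code set, the single keyword "alzheimer", the `Encounter` structure, and the reading of "at least 2 AD diagnoses" as "at least two encounters carrying an AD code" are all assumptions made for illustration. The `f1_score` helper shows how a rule's output could be scored against chart‐review labels.

```python
from dataclasses import dataclass

# Assumed for illustration only: the abstract does not enumerate the codes or keywords.
AD_CODES = {"G30.0", "G30.1", "G30.8", "G30.9"}  # ICD-10-CM Alzheimer's disease codes
AD_KEYWORDS = ("alzheimer",)                      # hypothetical keyword list


@dataclass
class Encounter:
    """Hypothetical encounter record: structured codes plus free-text note."""
    diagnosis_codes: set
    note_text: str = ""


def is_ad_encounter(enc: Encounter) -> bool:
    """An encounter counts as an AD encounter if it carries any assumed AD code."""
    return bool(enc.diagnosis_codes & AD_CODES)


def meets_cp(encounters: list, min_dx: int = 2) -> bool:
    """Rule sketch: >= min_dx AD-coded encounters AND an AD keyword in an AD encounter note."""
    ad_encounters = [e for e in encounters if is_ad_encounter(e)]
    if len(ad_encounters) < min_dx:
        return False
    return any(
        kw in e.note_text.lower() for e in ad_encounters for kw in AD_KEYWORDS
    )


def f1_score(predicted: list, actual: list) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), comparing rule output to chart-review labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Under these assumptions, a patient with two G30.9‐coded encounters, at least one of whose notes mentions Alzheimer's, satisfies the rule; a keyword that appears only in a non‐AD encounter does not count.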

Published in Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring

ISSN: 2352-8729 (Online)
Publisher: Wiley
Country of publisher: United States
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry: Neurology. Diseases of the nervous system; Medicine: Internal medicine: Special situations and conditions: Geriatrics
Website: https://alz-journals.onlinelibrary.wiley.com/journal/23528729
