International Journal of Population Data Science (Aug 2018)

Developing a Primary Care EMR-based Frailty Definition using Machine Learning

  • Sylvia Aponte-Hao,
  • Bria Mele,
  • Dave Jackson,
  • Alan Katz,
  • Charles Leduc,
  • Brendan Lethebe,
  • Sabrina Wong,
  • Tyler Williamson

DOI
https://doi.org/10.23889/ijpds.v3i4.811
Journal volume & issue
Vol. 3, no. 4

Abstract

Introduction Frailty is a geriatric syndrome that is predictive of heightened vulnerability to disability, hospitalization, and mortality. An estimated 250,000 frail Canadians die each year, and this estimate is expected to double in the next 40 years as Canadians grow older. Currently there is no single accepted clinical definition of frailty.
Objectives and Approach The objective of this study was to use machine learning to develop an operational definition of frailty that can be applied to a primary care electronic medical record (EMR) database. The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) is a pan-Canadian network of primary care practices that collect de-identified patient information (such as encounter diagnoses, health conditions, and laboratory data) from EMRs. A total of 780 patients from CPCSSN were randomly selected and assessed by physicians using the Rockwood Clinical Frailty Scale (as frail or not frail), and their clinical characteristics from CPCSSN were used to develop the definition using machine learning.
Results A total of 8,044 clinical features were extracted from the following tables: billing, problem list, encounter diagnosis, labs, medications, and referrals. A chi-squared automatic interaction detector (CHAID) approach was selected as the best approach. The bootstrapping process used a cost matrix that prioritized high sensitivity and positive predictive value. Ten-fold cross-validation was used to obtain the validity measures. Key features factored into the algorithm included a diagnosis of dementia (ICD-9 code 290), the medications furosemide and vitamins, and use of the keyword “obstruction” within the billing table. The validation measures with 95% confidence intervals are as follows: sensitivity of 28% (95% CI: 21% to 36%), specificity of 94% (95% CI: 93% to 96%), positive predictive value of 53% (95% CI: 42% to 64%), and negative predictive value of 86% (95% CI: 83% to 88%).
Conclusion/Implications No other primary care-specific frailty screening tools have sufficient validity. These results suggest that heterogeneous diseases require clearly defined features and potentially more sophisticated algorithms to account for heterogeneity. Further research using continuous features and continuous frailty scores may be better suited to creating a case detection algorithm.
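
The four validation measures reported in the Results (sensitivity, specificity, positive predictive value, negative predictive value) are all proportions taken from a 2x2 table of predicted versus physician-assessed frailty. The sketch below shows one way such measures and approximate 95% confidence intervals could be computed. It is illustrative only: the abstract does not state which interval method the authors used, so the Wilson score interval is an assumption, the function names are hypothetical, and the counts in the example are made up and are not the study's data.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. The choice of the Wilson
    method is an assumption; the abstract does not name a method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def validity_measures(tp, fp, fn, tn):
    """Return {measure: (estimate, ci_low, ci_high)} from 2x2 table counts.

    tp: predicted frail, assessed frail
    fp: predicted frail, assessed not frail
    fn: predicted not frail, assessed frail
    tn: predicted not frail, assessed not frail
    """
    # Each measure is a numerator over its own denominator.
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n))
        for name, (k, n) in pairs.items()
    }


# Hypothetical counts for illustration only (NOT taken from the study).
measures = validity_measures(tp=30, fp=20, fn=70, tn=380)
for name, (estimate, low, high) in measures.items():
    print(f"{name}: {estimate:.0%} (95% CI: {low:.0%} to {high:.0%})")
```

With a low sensitivity and a high specificity, as reported above, most patients flagged as not frail are correctly classified, but many frail patients are missed; the positive predictive value then depends heavily on how common frailty is in the sample.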