npj Digital Medicine (Apr 2021)

Medical records-based chronic kidney disease phenotype for clinical care and “big data” observational and genetic studies

  • Ning Shang,
  • Atlas Khan,
  • Fernanda Polubriaginof,
  • Francesca Zanoni,
  • Karla Mehl,
  • David Fasel,
  • Paul E. Drawz,
  • Robert J. Carroll,
  • Joshua C. Denny,
  • Matthew A. Hathcock,
  • Adelaide M. Arruda-Olson,
  • Peggy L. Peissig,
  • Richard A. Dart,
  • Murray H. Brilliant,
  • Eric B. Larson,
  • David S. Carrell,
  • Sarah Pendergrass,
  • Shefali Setia Verma,
  • Marylyn D. Ritchie,
  • Barbara Benoit,
  • Vivian S. Gainer,
  • Elizabeth W. Karlson,
  • Adam S. Gordon,
  • Gail P. Jarvik,
  • Ian B. Stanaway,
  • David R. Crosslin,
  • Sumit Mohan,
  • Iuliana Ionita-Laza,
  • Nicholas P. Tatonetti,
  • Ali G. Gharavi,
  • George Hripcsak,
  • Chunhua Weng,
  • Krzysztof Kiryluk

DOI
https://doi.org/10.1038/s41746-021-00428-1
Journal volume & issue
Vol. 4, no. 1
pp. 1 – 13

Abstract

Chronic kidney disease (CKD) is a slowly progressive disorder that is typically silent until its late stages, yet early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate (“A-by-G” grid). We manually validated the algorithm by 451 chart reviews across three medical systems, demonstrating an overall positive predictive value of 95% for CKD cases and 97% for healthy controls. Independent case-control validation using 2350 patient records demonstrated a diagnostic specificity of 97% and a sensitivity of 87%. Application of the phenotype to 1.3 million patients demonstrated that over 80% of CKD cases are undetected using ICD codes alone. We also demonstrated several large-scale applications of the phenotype, including identifying stage-specific kidney disease comorbidities, in silico estimation of kidney trait heritability in thousands of pedigrees reconstructed from medical records, and biobank-based multicenter genome-wide and phenome-wide association studies.
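
To make the “A-by-G” grid concrete, the following is a minimal sketch of the rule-based staging step only. It assumes the standard KDIGO category cut-offs for eGFR (mL/min/1.73 m²) and urine albumin-to-creatinine ratio (mg/g). It is not the authors' algorithm: the function names are hypothetical, and the machine-learning components and the requirement that abnormalities persist over time are left out.

```python
# Illustrative sketch only: KDIGO-style thresholds, hypothetical function names.
# This is NOT the published phenotype algorithm.

def g_category(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO G category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def a_category(uacr: float) -> str:
    """Map a urine albumin-to-creatinine ratio (mg/g) to a KDIGO A category."""
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def a_by_g_cell(egfr: float, uacr: float) -> tuple[str, str]:
    """Return the (G, A) cell of the staging grid for a single pair of lab values."""
    return g_category(egfr), a_category(uacr)


if __name__ == "__main__":
    # A patient with moderately reduced filtration and moderate albuminuria.
    print(a_by_g_cell(egfr=40.0, uacr=150.0))
```

A real phenotype would need to confirm chronicity across repeated measurements before assigning a stage; this sketch classifies one pair of values in isolation.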