CPT: Pharmacometrics & Systems Pharmacology (Sep 2023)

A systematic approach towards missing lab data in electronic health records: A case study in non‐small cell lung cancer and multiple myeloma

  • Arjun Sondhi,
  • Janick Weberpals,
  • Prakirthi Yerram,
  • Chengsheng Jiang,
  • Michael Taylor,
  • Meghna Samant,
  • Sarah Cherng

DOI
https://doi.org/10.1002/psp4.12998
Journal volume & issue
Vol. 12, no. 9
pp. 1201 – 1212

Abstract

Real‐world data derived from electronic health records often exhibit high levels of missingness in variables such as laboratory results, which presents a challenge for statistical analyses. We developed a systematic workflow for gathering evidence of different missingness mechanisms and performing subsequent statistical analyses. We quantify evidence for missing completely at random (MCAR) or missing at random (MAR) mechanisms using Hotelling's multivariate t‐test and random forest classifiers, respectively. We further illustrate how to apply sensitivity analyses using the not at random fully conditional specification procedure to examine changes in parameter estimates under missing not at random (MNAR) mechanisms. In simulation studies, we validated these diagnostics and compared analytic bias under different mechanisms. To demonstrate the workflow, we applied it to two case studies: an advanced non‐small cell lung cancer cohort and a multiple myeloma cohort, both derived from a real‐world oncology database. In both, we found strong evidence against MCAR and some evidence of MAR, implying that imputation approaches that predict missing values by fitting a model to observed data may be suitable. Sensitivity analyses did not suggest meaningful departures of our analytic results under potential MNAR mechanisms; these results were also in line with results reported in clinical trials.
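
The MCAR diagnostic named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard two‐sample Hotelling T² test with a pooled covariance matrix, applied to fully observed covariates split by whether a lab value is missing. The function name `hotelling_t2_test` and its arguments are hypothetical, chosen only for this example.

```python
import numpy as np
from scipy import stats


def hotelling_t2_test(X, missing_mask):
    """Two-sample Hotelling T^2 test comparing covariate means between
    rows where a lab value is missing and rows where it is observed.

    X            : (n, p) array of fully observed covariates (assumed complete)
    missing_mask : length-n boolean array, True where the lab value is missing

    Returns (t2, f_stat, p_value). A small p-value is evidence against MCAR,
    because under MCAR the two groups should share the same covariate means.
    """
    X = np.asarray(X, dtype=float)
    missing_mask = np.asarray(missing_mask, dtype=bool)

    x_miss, x_obs = X[missing_mask], X[~missing_mask]
    n1, n2, p = len(x_miss), len(x_obs), X.shape[1]
    if min(n1, n2) < 2 or n1 + n2 - p - 1 <= 0:
        raise ValueError("not enough rows in one or both groups")

    # Difference in mean covariate vectors between the two groups.
    diff = x_miss.mean(axis=0) - x_obs.mean(axis=0)

    # Pooled sample covariance (assumes equal covariance in both groups).
    s_pooled = (
        (n1 - 1) * np.cov(x_miss, rowvar=False)
        + (n2 - 1) * np.cov(x_obs, rowvar=False)
    ) / (n1 + n2 - 2)
    s_pooled = np.atleast_2d(s_pooled)

    # T^2 statistic, then its exact conversion to an F statistic.
    t2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(s_pooled, diff)
    f_stat = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)
    return float(t2), float(f_stat), float(p_value)
```

Rejecting the null hypothesis here only argues against MCAR. It does not distinguish MAR from MNAR, which is why the workflow pairs this test with a classifier‐based MAR diagnostic and MNAR sensitivity analyses.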