PLoS ONE (Jan 2022)

Assessing the added value of linking electronic health records to improve the prediction of self-reported COVID-19 testing and diagnosis

  • Dylan Clark-Boucher,
  • Jonathan Boss,
  • Maxwell Salvatore,
  • Jennifer A. Smith,
  • Lars G. Fritsche,
  • Bhramar Mukherjee

Journal volume & issue
Vol. 17, no. 7

Abstract

Read online

Since the beginning of the Coronavirus Disease 2019 (COVID-19) pandemic, a focus of research has been to identify risk factors associated with COVID-19-related outcomes, such as testing and diagnosis, and use them to build prediction models. Existing studies have used data from digital surveys or electronic health records (EHRs), but very few have linked the two sources to build joint predictive models. In this study, we used survey data on 7,054 patients from the Michigan Genomics Initiative biorepository to evaluate how well self-reported data could be integrated with electronic records for the purpose of modeling COVID-19-related outcomes. We observed that among survey respondents, self-reported COVID-19 diagnosis captured a larger number of cases than the corresponding EHRs, suggesting that self-reported outcomes may be better than EHRs for distinguishing COVID-19 cases from controls. In the modeling context, we compared the utility of survey- and EHR-derived predictor variables in models of survey-reported COVID-19 testing and diagnosis. We found that survey-derived predictors produced uniformly stronger models than EHR-derived predictors—likely due to their specificity, temporal proximity, and breadth—and that combining predictors from both sources offered no consistent improvement compared to using survey-based predictors alone. Our results suggest that, even though general EHRs are useful in predictive models of COVID-19 outcomes, they may not be essential in those models when rich survey data are already available. The two data sources together may offer better prediction for COVID severity, but we did not have enough severe cases in the survey respondents to assess that hypothesis in in our study.