PLoS ONE (Jan 2023)

Factors associated with resistance to SARS-CoV-2 infection discovered using large-scale medical record data and machine learning.

  • Kai-Wen K Yang,
  • Chloé F Paris,
  • Kevin T Gorman,
  • Ilia Rattsev,
  • Rebecca H Yoo,
  • Yijia Chen,
  • Jacob M Desman,
  • Tony Y Wei,
  • Joseph L Greenstein,
  • Casey Overby Taylor,
  • Stuart C Ray

DOI
https://doi.org/10.1371/journal.pone.0278466
Journal volume & issue
Vol. 18, no. 2
p. e0278466

Abstract

There have been over 621 million cases of COVID-19 worldwide, with over 6.5 million deaths. Despite the high secondary attack rate of COVID-19 in shared households, some exposed individuals do not contract the virus. Little is known, however, about whether the occurrence of COVID-19 resistance differs among people by health characteristics as recorded in electronic health records (EHR). In this retrospective analysis, we develop a statistical model to predict COVID-19 resistance in 8,536 individuals with prior COVID-19 exposure using demographics, diagnostic codes, outpatient medication orders, and count of Elixhauser comorbidities in EHR data from the COVID-19 Precision Medicine Platform Registry. Cluster analyses identified 5 patterns of diagnostic codes that distinguished resistant from non-resistant patients in our study population. Our models showed modest performance in predicting COVID-19 resistance (best-performing model AUROC = 0.61). Monte Carlo simulations indicated that the AUROC results are statistically significant (p < 0.001) for the testing set. We hope to validate the features found to be associated with resistance/non-resistance through more advanced association studies.
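
The abstract does not describe how the Monte Carlo simulations were set up. As an illustration only, one common way to test whether a modest AUROC differs from chance is a label-permutation test: shuffle the outcome labels many times, recompute the AUROC each time, and count how often the shuffled value reaches the observed one. The sketch below assumes that scheme; the function names `auroc` and `permutation_p_value`, the iteration count, and the synthetic data are hypothetical and are not taken from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a
    random negative (ties count as half), i.e. the Mann-Whitney form."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_iter=1000, seed=0):
    """Monte Carlo estimate of P(AUROC >= observed) under shuffled labels.

    Assumed procedure for illustration, not the authors' documented method.
    """
    rng = random.Random(seed)
    observed = auroc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # breaks any link between label and score
        if auroc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Synthetic data: scores for the positive class are shifted slightly upward.
    rng = random.Random(42)
    labels = [i % 2 for i in range(60)]
    scores = [rng.gauss(0.3 * y, 1.0) for y in labels]
    observed, p = permutation_p_value(labels, scores, n_iter=500)
    print(f"AUROC = {observed:.3f}, permutation p = {p:.4f}")
```

With the add-one correction, the smallest p-value this estimator can report is 1 / (n_iter + 1), so supporting a threshold such as p < 0.001 would require at least 1,000 shuffles.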