BMC Medical Informatics and Decision Making (Mar 2018)

Causal risk factor discovery for severe acute kidney injury using electronic health records

  • Weiqi Chen,
  • Yong Hu,
  • Xiangzhou Zhang,
  • Lijuan Wu,
  • Kang Liu,
  • Jianqin He,
  • Zilin Tang,
  • Xing Song,
  • Lemuel R. Waitman,
  • Mei Liu

DOI
https://doi.org/10.1186/s12911-018-0597-7
Journal volume & issue
Vol. 18, no. S1
pp. 57 – 63

Abstract

Read online

Abstract Background Acute kidney injury (AKI), characterized by abrupt deterioration of renal function, is a common clinical event among hospitalized patients and it is associated with high morbidity and mortality. AKI is defined in three stages with stage-3 being the most severe phase which is irreversible. It is important to effectively discover the true risk factors in order to identify high-risk AKI patients and allow better targeting of tailored interventions. However, Stage-3 AKI patients are very rare (only 0.2% of AKI patients) with a large scale of features available in EHR (1917 potential risk features), yielding a scenario unfeasible for any correlation-based feature selection or modeling method. This study aims to discover the key factors and improve the detection of Stage-3 AKI. Methods A causal discovery method (McDSL) is adopted for causal discovery to infer true causal relationship between information buried in EHR (such as medication, diagnosis, laboratory tests, comorbidities and etc.) and Stage-3 AKI risk. The research approach comprised two major phases: data collection, and causal discovery. The first phase is propose to collect the data from HER (includes 358 encounters and 891 risk factors). Finally, McDSL is employed to discover the causal risk factors of Stage-3 AKI, and five well-known machine learning models are built for predicting Stage-3 AKI with 10-fold cross-validation (predictive accuracy were measured by AUC, precision, recall and F-score). Results McDSL is useful for further research of EHR. It is able to discover four causal features, all selected features are medications that are modifiable. The latest research of machine learning is employed to compare the performance of prediction, and the experimental result has verified the selected features are pivotal. Conclusions The features selected by McDSL, which enable us to achieve significant dimension reduction without sacrificing prediction accuracy, suggesting potential clinical use such as helping physicians develop better prevention and treatment strategies.

Keywords