Scientific Reports (Dec 2022)

Detection of factors affecting kidney function using machine learning methods

  • Arezoo Haratian,
  • Zeinab Maleki,
  • Farzaneh Shayegh,
  • Alireza Safaeian

DOI
https://doi.org/10.1038/s41598-022-26160-8
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 15

Abstract

Due to the increasing prevalence of chronic kidney disease and its high mortality rate, the study of risk factors affecting the progression of the disease is of great importance. In this work, we aim to develop a framework for using machine learning methods to identify factors affecting kidney function. To this end, classification methods are trained to assign the serum creatinine level to one of three classes, each representing a different range of its values, based on the numerical values of other blood test parameters. Models are trained on blood test results from healthy subjects and patients, comprising 46 different blood test parameters. The best-performing models are random forest and LightGBM. Interpretation of the resulting model reveals a direct relationship between vitamin D and blood creatinine level. The detected relationship between these two parameters can be considered reliable given the relatively high predictive performance of the random forest model, which reaches an AUC of 0.90 and an accuracy of 0.74. Moreover, we develop a Bayesian network to infer the direct relationships between blood test parameters, and its results are consistent with those of the classification models. The proposed framework uses a comprehensive set of advanced imputation methods to deal with missing values, the main challenge of working with electronic health data. Hence, it can be applied to similar clinical studies to investigate and discover relationships between the factors under study.
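To make the evaluation setup concrete, the following is a minimal sketch of how a continuous creatinine value can be mapped to three classes and how accuracy and a multi-class AUC can be computed from predicted class probabilities. It is not the authors' code. The cutoffs, the toy values, the toy probabilities, and the choice of macro-averaged one-vs-rest AUC are all assumptions made for illustration; the abstract does not state the class ranges or how the AUC was aggregated.

```python
from bisect import bisect_right

# Hypothetical cutoffs (mg/dL); the abstract does not give the real class ranges.
CUTOFFS = (0.8, 1.2)


def creatinine_class(value, cutoffs=CUTOFFS):
    """Map a serum creatinine value to class 0, 1, or 2 (low, middle, high range)."""
    return bisect_right(cutoffs, value)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, n_classes=3):
    """Average the one-vs-rest AUC over all classes (an assumed aggregation)."""
    aucs = []
    for k in range(n_classes):
        labels = [t == k for t in y_true]
        scores = [row[k] for row in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes


# Toy data, invented for illustration only.
creatinine = [0.6, 0.7, 0.9, 1.0, 1.1, 1.4, 1.6]
y_true = [creatinine_class(v) for v in creatinine]

# Made-up class probabilities standing in for a trained classifier's output.
proba = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.3, 0.6],
    [0.1, 0.5, 0.4],
]
# Predicted class is the most probable one (first index wins on ties).
y_pred = [max(range(3), key=row.__getitem__) for row in proba]

print("accuracy:", accuracy(y_true, y_pred))
print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, proba))
```

In a three-class problem, accuracy and AUC measure different things: accuracy scores only the single predicted class, while AUC scores how well the predicted probabilities rank samples, so an AUC well above the accuracy, as reported in the abstract, is not contradictory.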