A Machine Learning Methodology for Diagnosing Chronic Kidney Disease

Jiongming Qin; Lin Chen; Yuhua Liu; Chuanjun Liu; Changhao Feng; Bin Chen

doi:10.1109/ACCESS.2019.2963053

IEEE Access (Jan 2020)

Affiliations

Jiongming Qin: Chongqing Key Laboratory of Non-linear Circuit and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
Lin Chen: Department of Electronics, Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
Yuhua Liu: Chongqing Key Laboratory of Non-linear Circuit and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
Chuanjun Liu: Department of Electronics, Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
Changhao Feng: Chongqing Key Laboratory of Non-linear Circuit and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
Bin Chen: Chongqing Key Laboratory of Non-linear Circuit and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China

DOI: https://doi.org/10.1109/ACCESS.2019.2963053
Journal volume & issue: Vol. 8, pp. 20991 – 21002

Abstract

Chronic kidney disease (CKD) is a global health problem with high morbidity and mortality rates, and it induces other diseases. Because the early stages of CKD have no obvious symptoms, patients often fail to notice the disease. Early detection of CKD enables patients to receive timely treatment to slow the progression of the disease. Machine learning models can help clinicians achieve this goal because of their fast and accurate recognition performance. In this study, we propose a machine learning methodology for diagnosing CKD. The CKD data set, which has a large number of missing values, was obtained from the University of California Irvine (UCI) machine learning repository. KNN imputation was used to fill in the missing values: for each incomplete sample, it selects several complete samples with the most similar measurements and uses them to fill in the missing data. Missing values are common in real-life medical situations because patients may miss some measurements for various reasons. After the incomplete data set was filled in, six machine learning algorithms (logistic regression, random forest, support vector machine, k-nearest neighbor, naive Bayes classifier and feed forward neural network) were used to establish models. Among these models, random forest achieved the best performance, with a diagnosis accuracy of 99.75%. By analyzing the misjudgments generated by the established models, we proposed an integrated model that combines logistic regression and random forest by using a perceptron, which achieved an average accuracy of 99.83% over ten simulation runs. Hence, we speculate that this methodology could be applicable to more complicated clinical data for disease diagnosis.
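The KNN imputation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knn_impute`, the toy data, the use of Euclidean distance over the observed features, and the choice to fill each gap with the neighbours' mean are all assumptions made for illustration. It also handles numeric features only, and assumes they are already on comparable scales.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries in each row using the k most similar complete rows.

    Hypothetical helper for illustration; numeric features only.
    """
    # Only rows with no missing entries can serve as neighbours.
    complete = [r for r in rows if all(v is not None for v in r)]
    if not complete:
        raise ValueError("need at least one complete row")

    imputed = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            imputed.append(list(row))
            continue

        observed = [j for j, v in enumerate(row) if v is not None]

        # Assumption: similarity is Euclidean distance over the features
        # that are observed in the incomplete row.
        def dist(candidate):
            return math.sqrt(sum((row[j] - candidate[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]

        # Assumption: each missing value becomes the mean of the
        # neighbours' values for that feature.
        filled = list(row)
        for j in missing:
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed


# Toy data (invented for this sketch): the last row lacks its second feature.
data = [
    [1.0, 10.0, 100.0],
    [1.2, 12.0, 110.0],
    [5.0, 50.0, 500.0],
    [5.2, 52.0, 520.0],
    [1.1, None, 105.0],
]
result = knn_impute(data, k=2)
print(result[4])  # → [1.1, 11.0, 105.0]
```

In the toy data, the two nearest complete rows are the first two, so the gap is filled with the mean of 10.0 and 12.0. Real clinical data would also need categorical features handled (for example by taking the neighbours' most frequent value) and feature scaling before distances are computed.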

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
