Scientific Reports (Sep 2022)
Construct a classification decision tree model to select the optimal equation for estimating glomerular filtration rate and estimate it more accurately
Abstract
Chronic kidney disease (CKD) has become a worldwide public health problem, and accurate assessment of renal function in CKD patients is important for treatment. Although the glomerular filtration rate (GFR) accurately reflects renal function, measuring it directly is complicated, so endogenous markers are often used to estimate GFR indirectly. However, the accuracy of existing GFR estimation equations is unsatisfactory. To estimate GFR more precisely, we constructed a classification decision tree model that selects the most suitable GFR estimation equation for each CKD patient. We searched the hospital information system (HIS) of the First Affiliated Hospital of Zhejiang Chinese Medicine University for all CKD patients who visited the hospital from December 1, 2018 to December 1, 2021 and whose GFR was measured by 99mTc-DTPA renal dynamic imaging (Gates' method). We collected 518 eligible subjects, who were randomly divided into a training set (70%, n = 362) and a test set (30%, n = 156). Using the training set, we built a classification decision tree model that chooses the most accurate of four equations, BIS-2, CKD-EPI(CysC), CKD-EPI(Cr-CysC) and Ruijin, and the equation selected by the model was then used to estimate GFR. We validated the tree model on the test set and compared the GFR it estimated with the estimates from 13 other equations. Root mean square error (RMSE), mean absolute error (MAE) and Bland–Altman plots were used to evaluate the accuracy of each method. The final classification decision tree model included body surface area (BSA), body mass index (BMI), 24-hour urine protein quantity, diabetic nephropathy, age and use of renin-angiotensin system inhibitors (RASi). In the test set, the RMSE and MAE of the GFR estimated by the classification decision tree model were 12.2 and 8.5, respectively, both lower than those of the other GFR estimation equations. In the Bland–Altman plot of the test set, the eGFR calculated with this model also showed the smallest degree of variation. Selecting a GFR estimation equation for each CKD patient with the classification decision tree model, and basing the final estimate on that selection, therefore gave more accurate GFR estimates.
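The selection and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each training patient is labelled with whichever of the four equations lies closest to the measured GFR, which is one plausible reading of "most accurate equation". The helper names (`rmse`, `mae`, `best_equation`) and all numbers are hypothetical toy values, not study data. In the study a decision tree predicts the label from clinical variables; here the labels are used directly, only to show how RMSE and MAE are computed on the selected estimates.

```python
import math

def rmse(measured, estimated):
    """Root mean square error between measured and estimated GFR."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)

def mae(measured, estimated):
    """Mean absolute error between measured and estimated GFR."""
    n = len(measured)
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / n

def best_equation(measured_gfr, estimates):
    """Name of the equation whose estimate is closest to the measured GFR.

    Assumption: this closest-estimate rule defines the class label that a
    classification tree would be trained to predict.
    """
    return min(estimates, key=lambda name: abs(estimates[name] - measured_gfr))

# Hypothetical toy data: measured GFR and four equation estimates per patient.
measured = [60.0, 45.0, 90.0]
per_patient_estimates = [
    {"BIS-2": 58.0, "CKD-EPI(CysC)": 66.0, "CKD-EPI(Cr-CysC)": 63.0, "Ruijin": 70.0},
    {"BIS-2": 52.0, "CKD-EPI(CysC)": 44.0, "CKD-EPI(Cr-CysC)": 49.0, "Ruijin": 40.0},
    {"BIS-2": 80.0, "CKD-EPI(CysC)": 97.0, "CKD-EPI(Cr-CysC)": 93.0, "Ruijin": 86.0},
]

# Class label per patient: the equation with the smallest absolute error.
labels = [best_equation(m, est) for m, est in zip(measured, per_patient_estimates)]

# GFR estimate obtained when the labelled equation is applied to each patient.
selected = [est[label] for est, label in zip(per_patient_estimates, labels)]

# Baseline for comparison: one fixed equation applied to every patient.
bis2_only = [est["BIS-2"] for est in per_patient_estimates]

print(labels)
print(rmse(measured, selected), mae(measured, selected))
print(rmse(measured, bis2_only), mae(measured, bis2_only))
```

In this toy case the per-patient selection has lower RMSE and MAE than the fixed BIS-2 baseline by construction, because the labels are derived from the measured values. A trained tree would only approximate these labels from predictors such as BSA, BMI and age, so its errors on unseen patients would be larger.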