Zhongguo quanke yixue (Mar 2024)
Prediction of Type 2 Diabetic Nephropathy Based on BP Neural Network Optimized by Sparrow Search Algorithm
Abstract
Background Diabetic nephropathy (DN) is one of the most common microvascular complications of diabetes; it is highly prevalent and harmful. Early detection of DN is important for preventing related diseases. Most current research relies on traditional statistical prediction methods, which require the data to satisfy their underlying assumptions. Because these methods have not met the needs of DN prediction in recent years, new approaches such as machine learning should be explored in this area.
Objective To construct a DN prediction model using LASSO regression and a BP neural network optimized by the sparrow search algorithm (SSA-BP).
Methods This study was conducted from April 2023 to August 2023, using publicly available data on complications in 133 patients with diabetes mellitus in Iran. Univariate analysis was performed with SPSS 26.0, and variables were screened with LASSO regression. With the presence of DN as the dependent variable, the data were divided into training and test sets at ratios of 8∶2 and 7∶3. The SSA-BP neural network was used for modeling, and its prediction performance was compared with that of classical machine learning models to identify the better DN prediction model. Models were evaluated by accuracy, precision, sensitivity, specificity, F1-score, and AUC.
Results After excluding 9 patients with type 1 diabetes, the effective sample comprised 124 patients with type 2 diabetes mellitus (T2DM), of whom 73 (58.9%) were diagnosed with DN. Univariate analysis of risk factors for type 2 DN showed statistically significant differences in age, BMI, duration of diabetes, fasting blood glucose (FBG), glycosylated hemoglobin (HbA1c), low-density lipoprotein (LDL), high-density lipoprotein (HDL), triacylglycerol (TG), systolic blood pressure (SBP), and diastolic blood pressure (DBP) (P<0.05).
When the ratio of the training set to the test set was 8∶2, there were 59 DN patients in the training set (n=100) and 14 DN patients in the test set (n=24). LASSO regression screening identified five influencing factors: age, diabetes duration, HbA1c, LDL, and SBP. The test-set accuracies of the logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), BP, and SSA-BP models were 83.33%, 79.17%, 79.17%, 87.50%, and 95.83%, with F1-scores of 0.8462, 0.8000, 0.8000, 0.8889, and 0.9600, respectively. When the ratio was 7∶3, there were 52 DN patients in the training set (n=88) and 21 DN patients in the test set (n=36). LASSO regression screening identified seven influencing factors: age, BMI, diabetes duration, LDL, HDL, SBP, and DBP. The test-set accuracies of the LR, KNN, SVM, BP, and SSA-BP models were 86.11%, 86.11%, 86.11%, 72.22%, and 91.67%, with F1-scores of 0.8718, 0.8718, 0.8649, 0.7059, and 0.9091, respectively.
Conclusion LR, KNN, and SVM perform better with a 7∶3 training-to-test ratio, while BP and SSA-BP perform better with an 8∶2 ratio. Compared with the BP neural network and traditional machine learning models, the SSA-BP model has the best prediction performance and can identify type 2 DN patients promptly and accurately, enabling early detection and treatment of DN and thereby preventing and mitigating harm to patients.
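As an illustration of how the evaluation indicators named in the Methods relate to one another, the following minimal Python sketch computes accuracy, precision, sensitivity, specificity, and F1-score from a 2×2 confusion matrix. The function name `classification_metrics` and the confusion-matrix counts are assumptions, not taken from the study: the counts were chosen here to be consistent with the reported LR results at the 8∶2 split (24 test patients, 14 with DN, accuracy 83.33%, F1-score 0.8462), treating DN as the positive class. AUC is omitted because it requires predicted scores rather than class counts.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp: DN patients predicted as DN        fn: DN patients predicted as non-DN
    fp: non-DN patients predicted as DN    tn: non-DN patients predicted as non-DN
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts (an assumption, not reported in the abstract):
# 14 DN and 10 non-DN test patients, with 3 missed DN cases and 1 false alarm.
lr_metrics = classification_metrics(tp=11, fn=3, fp=1, tn=9)

for name, value in lr_metrics.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts the sketch gives an accuracy of 0.8333 and an F1-score of 0.8462, matching the figures reported for LR; the precision, sensitivity, and specificity it prints are implied by the assumed counts and are not values stated in the abstract.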
Keywords