Journal of Diabetes Research (Jan 2023)
A Metabolism-Based Interpretable Machine Learning Prediction Model for Diabetic Retinopathy Risk: A Cross-Sectional Study in Chinese Patients with Type 2 Diabetes
Abstract
The burden of diabetic retinopathy (DR) is increasing, and sensitive biomarkers of the disease remain scarce. Studies have found that the metabolic profile, including amino acids (AAs) and acylcarnitines (AcylCNs), may already be altered in the early stages of DR, indicating the potential of metabolites to serve as new biomarkers. We aimed to construct a metabolite-based prediction model for DR risk. This study was conducted on type 2 diabetes (T2D) patients with or without DR. Logistic regression and extreme gradient boosting (XGBoost) prediction models were constructed using the traditional clinical features and the screened features, respectively. The predictive power of the models was assessed in terms of both discrimination and calibration, and the optimal model was interpreted using Shapley Additive exPlanations (SHAP) to quantify the effect of each feature on the prediction. The XGBoost model incorporating AA and AcylCN variables had the best overall performance (ROCAUC = 0.82, PRAUC = 0.44, Brier score = 0.09). C18:1OH lower than 0.04 μmol/L, C18:1 lower than 0.70 μmol/L, threonine higher than 27.0 μmol/L, and tyrosine lower than 36.0 μmol/L were associated with an increased risk of developing DR. Phenylalanine higher than 52.0 μmol/L was associated with a decreased risk of developing DR. In conclusion, our study mainly used AAs and AcylCNs to construct an interpretable XGBoost model that predicts the risk of developing DR in T2D patients, which is beneficial for identifying high-risk groups and for preventing or delaying the onset of DR. In addition, our study proposed possible risk cut-off values of C18:1OH, C18:1, threonine, tyrosine, and phenylalanine for DR.
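As an illustration of the three evaluation metrics named above, the following minimal Python sketch computes a ROC AUC and a PR AUC (discrimination) and a Brier score (calibration) from predicted probabilities. It is not the study's code: the function names, the four-patient toy data, and the choice of average precision as the PR AUC estimator are assumptions made only for this example, and the abstract does not state how PRAUC was computed.

```python
# Illustrative sketch only: toy data and helper names are hypothetical.

def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Average precision, one common estimator of the area under the PR curve.

    Assumption: tied scores are ordered arbitrarily by the sort.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    if true_positives == 0:
        raise ValueError("no positive samples")
    return precision_sum / true_positives


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical labels (1 = DR, 0 = no DR) and predicted risks for four patients.
y_true = [0, 0, 1, 1]
y_prob = [0.10, 0.40, 0.35, 0.80]

print("ROC AUC:", roc_auc(y_true, y_prob))
print("PR AUC (average precision):", average_precision(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
```

A higher ROC AUC or PR AUC indicates better separation of DR from non-DR patients, whereas a lower Brier score indicates predicted risks that lie closer to the observed outcomes. When positives are rare, the PR AUC is typically much lower than the ROC AUC, so the two values should be read together rather than compared directly.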