Chinese Medicine (May 2024)

Development of an interpretable machine learning model associated with genetic indicators to identify Yin-deficiency constitution

  • Jing Li,
  • Yingying Zhai,
  • Yanqi Cao,
  • Yifan Xia,
  • Ruoxi Yu

DOI
https://doi.org/10.1186/s13020-024-00941-x
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 12

Abstract

Background: Traditional Chinese Medicine (TCM) defines constitutions that are associated with corresponding diseases. Yin-deficiency constitution is one of the common constitutions and influences disease onset in a number of people in the Chinese population. Accurate identification of Yin-deficiency constitution is therefore important for disease prevention and treatment.

Methods: We recruited participants with Yin-deficiency constitution and participants with balanced constitution. The least absolute shrinkage and selection operator (LASSO) and logistic regression were used to analyze genetic predictors. Four machine learning models for Yin-deficiency constitution classification, each using multiple combined genetic indicators, were built and evaluated to identify the optimal model and features. Shapley Additive exPlanations (SHAP) were used to explain the model.

Results: NFKBIA, BCL2A1 and CCL4 were the genetic indicators most strongly associated with Yin-deficiency constitution. A random forest using these three genetic predictors was the optimal model, with an area under the curve (AUC) of 0.937 (95% CI 0.844–1.000), a sensitivity of 0.870 and a specificity of 0.900. The SHAP method provided an intuitive explanation of the risk contributions behind individual predictions.

Conclusion: We constructed a machine learning classification model for Yin-deficiency constitution and explained it with the SHAP method, providing an objective system for identifying Yin-deficiency constitution in TCM and guidance for clinicians.
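The Results report AUC, sensitivity and specificity for the classifier. As a rough illustration of what these three metrics measure, the sketch below computes them from predicted scores and true labels in plain Python. The function names, the 0.5 decision threshold and the toy data are assumptions made for this example; they are not taken from the study, and the toy values do not reproduce its results.

```python
from typing import Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A sample is predicted positive when its score >= threshold (assumed 0.5)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = Yin-deficiency, 0 = balanced constitution.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

toy_auc = auc_score(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
```

On the toy data, 14 of the 16 positive/negative pairs are ranked correctly, so `toy_auc` is 0.875, and three of four samples in each class fall on the correct side of the threshold, so `toy_sens` and `toy_spec` are both 0.75. The pairwise AUC loop is quadratic in sample count, which is acceptable for an illustration but not for large cohorts.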

Keywords