Biomedicine & Pharmacotherapy (May 2021)
Modeling of diagnosis for metabolic syndrome by integrating symptoms into physiochemical indexes
Abstract
Background: Metabolic syndrome (MS) is a major global health concern comprising a cluster of co-occurring conditions that increase the risk of heart disease, stroke and type 2 diabetes. MS is usually diagnosed using a combination of physicochemical indexes (such as BMI, abdominal circumference and blood pressure), but this approach largely ignores clinical symptoms when investigating prevention and treatment of the disease. Exploring predictors of MS using multiple diagnostic indicators may improve its early diagnosis and treatment. Traditional Chinese medicine (TCM) attaches importance to the etiology of disease symptoms and indications using four diagnostic methods, which have long been used to treat metabolic disease. Therefore, in this study, we aimed to develop predictive indicators for MS using both physicochemical indexes and TCM methods.
Methods: Clinical information (including both physicochemical and TCM indexes) was obtained from a cohort of 586 individuals across 4 hospitals in China, comprising 136 healthy controls and 450 MS cases. Using this cohort, we compared three classic machine learning methods, decision tree (DT), support vector machine (SVM) and random forest (RF), for MS diagnosis using physicochemical and TCM indexes, and selected the best model by comparing the accuracy, specificity and sensitivity of the three models. In parallel, the best ratio of training data to test data was determined by observing how these evaluation indexes changed under each model. Next, three subsets containing different categories of variables (TCM and physicochemical indexes combined, termed the “fused indexes”; physicochemical indexes only; and TCM indexes only) were compared using the best-performing model and the optimum training-to-test ratio. Finally, the best subset was selected through comprehensive comparative analysis, and the important prediction variables were then selected according to their weight.
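The three evaluation indexes named in the Methods (accuracy, sensitivity and specificity) follow from the counts of a binary confusion matrix. The sketch below is only an illustration of those definitions, not the authors' code: the function name `diagnostic_metrics`, the label coding (1 = MS case, 0 = healthy control) and the toy labels are assumptions introduced here.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels.

    Assumed coding (not from the paper): 1 = MS case, 0 = healthy control.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "accuracy": (tp + tn) / len(y_true),  # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),        # share of true cases that are detected
        "specificity": tn / (tn + fp),        # share of controls correctly ruled out
    }


# Hypothetical toy labels, for illustration only (not study data)
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

In a cohort where cases outnumber controls, as here (450 versus 136), a model can reach high accuracy and sensitivity while misclassifying many controls, which is why specificity is worth reporting separately.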
Results: When comparing the three models, we found that the RF model had the highest average accuracy (average 0.942, 95% CI [0.925, 0.958]) and sensitivity (average 0.993, 95% CI [0.990, 0.996]). In addition, when the training set accounted for 80% of the cohort data, the RF model achieved its best specificity while accuracy and sensitivity also remained very high. Across the three subsets, the prediction accuracy and sensitivity of the models using the fused indexes and the physicochemical indexes only remained at a high level. Further, the mean specificity of the model using fused indexes was 0.916, which was significantly higher than that of the model with only physicochemical indexes (average 0.822) and the model with only TCM indexes (average 0.403). Based on the RF model and the 8:2 data allocation ratio, we further extracted the top 20 most significant variables from the fused indexes, which comprised 14 physicochemical indexes and 6 TCM indexes, including wiry pulse, chest tightness, spontaneous perspiration and greasy tongue coating.
Conclusion: Compared with the SVM and DT models, the RF model showed the best performance, especially when the ratio of the training set to the test set was 8:2. Compared with models built on either type of index alone, the model constructed by combining physicochemical indexes with TCM indexes (i.e. the fused indexes) exhibited better predictive ability. In addition to common physicochemical indexes, some TCM indexes, such as wiry pulse, chest tightness, spontaneous perspiration and greasy tongue coating, can also improve the diagnosis of MS.
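The Results report each evaluation index as an average with a 95% confidence interval. The abstract does not state how those intervals were computed, so the following is only a sketch under an assumed method: a normal-approximation interval for the mean over repeated runs. The helper name `mean_with_ci` and the per-run accuracies are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_with_ci(values: Sequence[float], level: float = 0.95) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal-approximation confidence interval.

    Assumption: the values are independent repeated-run scores; the paper's
    actual interval method is not given in the abstract.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate a standard error")
    m = mean(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    half_width = z * stdev(values) / sqrt(n)     # z times the standard error of the mean
    return m, m - half_width, m + half_width


# Hypothetical per-run accuracies, for illustration only (not study data)
run_accuracies = [0.93, 0.95, 0.94, 0.96, 0.92]
avg, lower, upper = mean_with_ci(run_accuracies)
print(f"average {avg:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

With only a handful of runs, a Student's t quantile would give a somewhat wider interval than the normal approximation used in this sketch.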