Lipids in Health and Disease (Jan 2024)
Risk factor analysis and risk prediction study of obesity in steelworkers: model development based on an occupational health examination cohort dataset
Abstract
Background: Obesity is increasingly recognized as a grave public health concern globally. It is associated with prevalent diseases including coronary heart disease, fatty liver, type 2 diabetes, and dyslipidemia. Prior research has identified demographic, socioeconomic, lifestyle, and genetic factors as contributors to obesity. Nevertheless, the influence of occupational risk factors on obesity among workers remains under-explored. Investigating risk factors specific to steelworkers is crucial for early detection, prediction, and effective intervention, thereby safeguarding their health.
Methods: This study drew on a cohort study of health effects among workers at an iron and steel company in Hebei Province, China, and included 5469 participants. Predictor variables were selected through univariate analysis, multifactor analysis, and a review of the relevant literature. Three predictive models were employed: XGBoost, Support Vector Machine (SVM), and Random Forest (RF).
Results: Univariate analysis and Cox proportional hazards regression modeling identified age, gender, smoking and drinking habits, dietary score, physical activity, shift work, exposure to high temperatures, occupational stress, and carbon monoxide exposure as key factors in the development of obesity in steelworkers. On the test set, accuracies were 0.819, 0.868, and 0.872 for XGBoost, SVM, and RF, respectively. Precision was 0.571, 0.696, and 0.765, and recall was 0.333, 0.592, and 0.481. The models achieved AUCs of 0.849, 0.908, and 0.912, with Brier scores of 0.128, 0.105, and 0.104, log losses of 0.409, 0.349, and 0.345, and calibration-in-the-large of 0.058, 0.054, and 0.051, respectively. Among these, the Random Forest model performed best overall, with the highest accuracy, precision, and AUC and the lowest Brier score, log loss, and calibration-in-the-large, although its recall (0.481) was lower than that of SVM (0.592).
Conclusions: The research indicates that obesity in steelworkers results from a combination of occupational and lifestyle factors. Of the models tested, the Random Forest model exhibited the best predictive ability, highlighting its potential for practical application.
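For readers less familiar with the evaluation metrics reported above, the following is a minimal sketch of how each could be computed from true labels and predicted probabilities. It is not the authors' code. The toy labels and probabilities are hypothetical, the 0.5 decision threshold is an assumption, and calibration-in-the-large is taken here to be the absolute difference between the mean predicted probability and the observed event rate, which is one common definition but is not confirmed by the abstract.

```python
import math

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN at an assumed decision threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def accuracy(y_true, y_prob, threshold=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return (tp + tn) / (tp + fp + tn + fn)

def precision(y_true, y_prob, threshold=0.5):
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    return tp / (tp + fp)

def recall(y_true, y_prob, threshold=0.5):
    tp, _, _, fn = confusion_counts(y_true, y_prob, threshold)
    return tp / (tp + fn)

def auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

def calibration_in_the_large(y_true, y_prob):
    """Assumed definition: |mean predicted probability - observed event rate|."""
    return abs(sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true))

# Hypothetical toy data (1 = obese, 0 = not obese); not taken from the study.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.2, 0.8, 0.9]

for name, fn in [("accuracy", accuracy), ("precision", precision),
                 ("recall", recall), ("AUC", auc),
                 ("Brier score", brier_score), ("log loss", log_loss),
                 ("calibration-in-the-large", calibration_in_the_large)]:
    print(f"{name}: {fn(y_true, y_prob):.3f}")
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC, Brier score, and log loss are computed directly from the predicted probabilities. This is one reason a model can rank first on most metrics while trailing on recall.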
Keywords