Cell Journal (Aug 2023)

Regularized Machine Learning Models for Prediction of Metabolic Syndrome Using GCKR, APOA5, and BUD13 Gene Variants: Tehran Cardiometabolic Genetic Study

  • Nadia Alipour,
  • Anoshirvan Kazemnejad,
  • Mahdi Akbarzadeh,
  • Farzad Eskandari,
  • Asiyeh Sadat Zahedi,
  • Maryam S Daneshpour

DOI
https://doi.org/10.22074/cellj.2023.2000864.1294
Journal volume & issue
Vol. 25, no. 8
pp. 536 – 545

Abstract

Objective: Metabolic syndrome (MetS) is a complex multifactorial disorder that considerably burdens healthcare systems. We aim to classify MetS using regularized machine learning models in the presence of the risk variants of GCKR, BUD13 and APOA5, and environmental risk factors.

Materials and Methods: A cohort study was conducted on 2,346 cases and 2,203 controls from eligible Tehran Cardiometabolic Genetic Study (TCGS) participants whose data were collected from 1999 to 2017. We used different regularization approaches [least absolute shrinkage and selection operator (LASSO), ridge regression (RR), elastic net (ENET), adaptive LASSO (aLASSO), and adaptive ENET (aENET)] and a classical logistic regression (LR) model to classify MetS and select influential variables that predict MetS. Demographics, clinical features, and common polymorphisms in the GCKR, BUD13, and APOA5 genes of eligible participants were assessed to classify TCGS participant status in MetS development. The models’ performance was evaluated by 10-repeated 10-fold cross-validation. Various assessment measures of sensitivity, specificity, classification accuracy, and area under the receiver operating characteristic curve (AUC-ROC) and AUC-precision-recall (AUC-PR) curves were used to compare the models.

Results: During the follow-up period, 50.38% of participants developed MetS. The groups were not similar in terms of baseline characteristics and risk variants. MetS was significantly associated with age, gender, schooling years, body mass index (BMI), and alternate alleles in all the risk variants, as indicated by LR. A comparison of accuracy, AUC-ROC, and AUC-PR metrics indicated that the regularization models outperformed LR. Regularized machine learning models provided comparable classification performances, whereas the aLASSO model was more parsimonious and selected fewer predictors.

Conclusion: Regularized machine learning models provided more accurate and parsimonious MetS classifying models. These high-performing diagnostic models can lay the foundation for clinical decision support tools that use genetic and demographical variables to locate individuals at high risk for MetS.
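
The evaluation design described in the abstract (penalized logistic models compared by 10-repeated 10-fold cross-validation on AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data from make_classification in place of the TCGS variables, uses stratified folds, fixes the penalty strength (C=1.0) and the elastic-net mixing (l1_ratio=0.5) rather than tuning them, and omits the adaptive variants (aLASSO, aENET). The helper name compare_penalties is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_penalties(X, y, n_splits=10, n_repeats=10, seed=0):
    """Return the mean cross-validated AUC-ROC of three penalized logistic models."""
    # Penalty settings are illustrative assumptions, not values from the study.
    models = {
        "LASSO": LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000),
        "RR": LogisticRegression(penalty="l2", solver="saga", C=1.0, max_iter=5000),
        "ENET": LogisticRegression(
            penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000
        ),
    }
    # 10 folds repeated 10 times; stratification is an assumption of this sketch.
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    scores = {}
    for name, clf in models.items():
        # Standardize inside the pipeline so scaling is refit on each training fold.
        pipe = make_pipeline(StandardScaler(), clf)
        auc = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc")
        scores[name] = float(auc.mean())
    return scores


# Synthetic stand-in data: 400 samples, 8 predictors, binary outcome.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4, random_state=0)
results = compare_penalties(X, y)
for name, auc in results.items():
    print(f"{name}: mean AUC-ROC = {auc:.3f}")
```

In practice the penalty strength would be tuned inside each training fold (for example with nested cross-validation), and AUC-PR could be obtained by passing scoring="average_precision" instead.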

Keywords