陆军军医大学学报 (Jan 2025)

Construction and validation of a risk prediction model for high altitude de-acclimatization syndrome

  • DING Yu,
  • WANG Zejun,
  • XIE Jiaxin

DOI
https://doi.org/10.16016/j.2097-0927.202408110
Journal volume & issue
Vol. 47, no. 1
pp. 20 – 29

Abstract

Objective: To construct risk prediction models for high altitude de-acclimatization syndrome (HADAS) in people returning from the plateau to the plains using different machine learning algorithms, and to validate the predictive performance of these models. Methods: Field or online surveys were conducted among individuals who had ended their high-altitude residence and returned to the plains between November 2020 and February 2024. Basic information, chronic mountain sickness (CMS), HADAS symptoms and other data were collected. After the inclusion and exclusion criteria were applied, a total of 1 095 individuals were enrolled and assigned to the modeling group. A positive event was defined as a HADAS score >5. The modeling group was then divided into a training set (n=766) and an internal test set (n=329) at a 7:3 ratio. Least absolute shrinkage and selection operator (LASSO) regression was used to select independent variables. Risk prediction models for HADAS were built with 8 machine learning methods: multivariable logistic regression (LR), decision tree (DT), random forest (RF), eXtreme gradient boosting (XGB), support vector machine (SVM), K-nearest neighbor (KNN), light gradient boosting (LGB) and naïve Bayes (NB). The models were compared and evaluated in the internal test set using receiver operating characteristic (ROC) curves, calibration curves and confusion matrices. The final model was presented using a nomogram or the Shapley additive explanations (SHAP) algorithm. In August 2024, another 132 individuals who had returned to the plains and met the same criteria were recruited as the external validation group. Results: Of the 1 095 subjects, 549 (50.14%) had HADAS symptoms. LASSO regression identified CMS score, age and duration of high-altitude residence as significant predictors.
Among the 8 machine learning algorithms, the LR model performed best, with area under the curve (AUC) values of 0.819 (95% CI: 0.789–0.850) in the training set and 0.841 (95% CI: 0.799–0.884) in the internal test set, and an F1 score of 0.801 in the internal test set; both its AUC and its F1 score were the highest among the 8 models in the internal test set. The Spiegelhalter Z test of the LR model's calibration curve gave P=0.703 in the training set and P=0.281 in the internal test set. In the external validation set, the AUC of the LR model was 0.867 (95% CI: 0.765–0.969). Conclusion: The LR model built on CMS score, age and duration of high-altitude residence shows good overall performance in the internal test set and good discrimination in the external validation set. The constructed nomogram is convenient to apply.
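
The evaluation measures named in the abstract (the 7:3 split, AUC, F1 score and the Spiegelhalter Z test) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 classification threshold and the toy labels and probabilities are all assumptions made for illustration, and only the standard library is used.

```python
import math

def split_sizes(n, train_tenths=7):
    """Sizes of a 7:3 split; integer arithmetic avoids float rounding."""
    n_train = n * train_tenths // 10
    return n_train, n - n_train

def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.
    Tied scores count as half a correct pair."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def f1_score(y_true, y_prob, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN); the 0.5 threshold is an assumption."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter Z statistic and two-sided P value for calibration.
    A large P value gives no evidence of miscalibration."""
    num = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    var = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    z = num / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_value

# Toy data (hypothetical): 1 = HADAS score > 5, with predicted probabilities.
y_toy = [1, 1, 1, 0, 1, 0, 0, 0]
p_toy = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

n_train, n_test = split_sizes(1095)   # sizes reported in the abstract
toy_auc = auc(y_toy, p_toy)
toy_f1 = f1_score(y_toy, p_toy)
toy_z, toy_p = spiegelhalter_z(y_toy, p_toy)
print(n_train, n_test, toy_auc, toy_f1, round(toy_z, 3), round(toy_p, 3))
```

Read this way, the reported Spiegelhalter P values of 0.703 and 0.281 indicate that the test found no significant departure from good calibration in either set, while AUC and F1 describe discrimination and classification accuracy respectively.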

Keywords