Scandinavian Journal of Primary Health Care (Oct 2024)

Predictive modeling for identification of older adults with high utilization of health and social services

  • Heba Sourkatti,
  • Juha Pajula,
  • Teemu Keski-Kuha,
  • Juha Koivisto,
  • Mika Hilvo,
  • Jaakko Lähteenmäki

DOI
https://doi.org/10.1080/02813432.2024.2372297
Journal volume & issue
Vol. 42, no. 4
pp. 609 – 616

Abstract

Aim: Machine learning techniques have demonstrated success in predictive modeling across various clinical cases. However, few studies have considered predicting the use of multisectoral health and social services among older adults. This research aims to use machine learning models to detect groups at high risk of excessive health and social services utilization at an early stage, facilitating the implementation of preventive interventions.

Methods: We used pseudonymized data covering a four-year period and including information on a total of 33,374 senior citizens from Southern Finland. The endpoint was defined based on the occurrence of unplanned healthcare visits and the total number of different services used. Input features included individuals’ basic demographics, health status and past usage of healthcare resources. Logistic regression and eXtreme Gradient Boosting (XGBoost) methods were used for binary classification, with the dataset split into 70% training and 30% testing sets.

Results: Subgroup-based results mirrored trends observed in the full cohort, with age and certain health issues, e.g. mental health, emerging as positive predictors for high service utilization. Conversely, hospital stay and urban residence were associated with decreased risk. The models achieved a classification performance (AUC) of 0.61 for the full cohort and 0.55–0.62 for the subgroups.

Conclusions: Predictive models offer potential for predicting future high service utilization in the older adult population. Achieving high classification performance remains challenging due to diverse contributing factors. We anticipate that classification performance could be increased by including features based on additional data categories such as socio-economic data.
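The evaluation setup named in the Methods (binary classification, a 70%/30% train/test split, and AUC as the performance measure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the two features (`age_scaled`, `mental_health_flag`) and their coefficients are hypothetical, and the logistic regression is a plain gradient-descent implementation using only the Python standard library. It is not the authors' pipeline, and XGBoost is not reproduced here.

```python
import math
import random


def make_synthetic_cohort(n, seed=0):
    """Generate hypothetical stand-in records: ([age_scaled, mental_health_flag], label)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age_scaled = rng.uniform(-1.0, 1.0)                    # assumed scaled age
        mental_health_flag = 1.0 if rng.random() < 0.3 else 0.0
        # Assumed (invented) relationship between features and high service use.
        logit = 2.0 * age_scaled + 1.5 * mental_health_flag - 0.5
        p_high_use = 1.0 / (1.0 + math.exp(-logit))
        label = 1 if rng.random() < p_high_use else 0
        rows.append(([age_scaled, mental_health_flag], label))
    return rows


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and return (70% training, 30% testing)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(train, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n_features = len(train[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in train:
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / len(train) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(train)
    return weights, bias


def predict_proba(model, x):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    train, test = split_70_30(make_synthetic_cohort(1000))
    model = fit_logistic(train)
    test_scores = [predict_proba(model, x) for x, _ in test]
    test_labels = [y for _, y in test]
    print("test AUC:", round(auc(test_labels, test_scores), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.61 and 0.55–0.62 should be read. The sign of each fitted weight indicates whether a feature raises or lowers the predicted risk, which is how a logistic regression model yields positive and negative predictors of the kind described in the Results.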

Keywords