BMC Public Health (Jun 2024)

How predictive of future healthcare utilisation and mortality is data-driven population segmentation based on healthcare utilisation and chronic condition comorbidity?

  • Andrea Gartner,
  • Rhian Daniel,
  • Ciarán Slyne,
  • Kelechi Ebere Nnoaham

DOI
https://doi.org/10.1186/s12889-024-19065-w
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 10

Abstract


Background: In recent years, data-driven population segmentation using cluster analysis, mainly of health care utilisation data, has been used as a proxy for future health care need. Chronic condition patterns have tended to be examined after segmentation, but they may be useful as a segmentation variable which, in combination with utilisation, could indicate severity. Such segments could also be of practical use in targeting specific clinical groups, including for prevention. This study aimed to assess the ability of data-driven segmentation based on health care utilisation and comorbidities to predict future outcomes: emergency admission, A&E attendance, GP practice contacts, and mortality.

Methods: We analysed record-linked data for 412,997 patients registered with GP practices in 2018-19 in the Cwm Taf Morgannwg University Health Board (CTM UHB) area within the Secure Anonymised Information Linkage (SAIL) Databank. We created 10 segments using k-means clustering based on utilisation (GP practice contacts, prescriptions, emergency and elective admissions, A&E and outpatients) and chronic condition counts for 2018, using different variable compositions to denote need. We assessed the characteristics of the segments. We employed a train/test scheme (80% training set) to compare logistic regression model predictions with observed outcomes on follow-up in 2019. We assessed the area under the ROC curve (AUC) for models with demographic variables, with and without the segments, and compared segmentation implementations (with/without comorbidity and primary care data).

Results: Adding the segments to the model with demographic covariates improved the prediction for all outcomes. For emergency admissions, discrimination increased from AUC 0.65 (CI 0.64–0.65) to 0.73 (CI 0.73–0.74). Models with the segments only performed nearly as well as the full models. Excluding comorbidity from the segmentation reduced predictive ability for mortality (performance was similar otherwise), but the most pronounced reduction occurred when all primary care variables were excluded.

Conclusions: These results show that the segments have satisfactory predictive ability, even across varied outcomes and with a broad range of events and conditions used in the segmentation. They suggest that the segments can be a useful tool in helping to identify specific groups of need to target with anticipatory care. Identification may be refined with selected diagnoses or more specialised tools such as risk stratification.
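As an illustration of the evaluation metric described in the abstract, the following is a minimal sketch of how the area under the ROC curve (AUC) can be computed and compared between two models. It uses the rank (Mann-Whitney) formulation of AUC rather than any code from the study. The function name `auc` and all toy values (`outcome`, `risk_demographics`, `risk_with_segments`) are hypothetical and are not taken from the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    This equals the probability that a randomly chosen patient with the
    outcome gets a higher predicted risk than a randomly chosen patient
    without it. Ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy test set: 1 = outcome observed on follow-up, 0 = not.
outcome = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical predicted risks from a demographics-only model ...
risk_demographics = [0.30, 0.20, 0.10, 0.25, 0.15, 0.10, 0.05, 0.05]
# ... and from a model that also includes segment membership.
risk_with_segments = [0.60, 0.40, 0.15, 0.30, 0.10, 0.10, 0.05, 0.05]

auc_base = auc(outcome, risk_demographics)
auc_full = auc(outcome, risk_with_segments)
print(f"demographics only:       AUC = {auc_base:.2f}")
print(f"demographics + segments: AUC = {auc_full:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, so a higher value for the second model in this toy example mirrors the kind of comparison reported in the Results, though not its magnitudes.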

Keywords