Scientific Reports (Oct 2022)

Clustering of trauma patients based on longitudinal data and the application of machine learning to predict recovery

  • Kostas Stoitsas,
  • Saurabh Bahulikar,
  • Leonie de Munter,
  • Mariska A. C. de Jongh,
  • Maria A. C. Jansen,
  • Merel M. Jung,
  • Marijn van Wingerden,
  • Katrijn Van Deun

DOI
https://doi.org/10.1038/s41598-022-21390-2
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Predicting recovery after trauma is important to provide patients a perspective on their estimated future health, to engage in shared decision making and target interventions to relevant patient groups. In the present study, several unsupervised techniques are employed to cluster patients based on longitudinal recovery profiles. Subsequently, these data-driven clusters were assessed on clinical validity by experts and used as targets in supervised machine learning models. We present a formalised analysis of the obtained clusters that incorporates evaluation of (i) statistical and machine learning metrics, (ii) clusters clinical validity with descriptive statistics and medical expertise. Clusters quality assessment revealed that clusters obtained through a Bayesian method (High Dimensional Supervised Classification and Clustering) and a Deep Gaussian Mixture model, in combination with oversampling and a Random Forest for supervised learning of the cluster assignments provided among the most clinically sensible partitioning of patients. Other methods that obtained higher classification accuracy suffered from cluster solutions with large majority classes or clinically less sensible classes. Models that used just physical or a mix of physical and psychological outcomes proved to be among the most sensible, suggesting that clustering on psychological outcomes alone yields recovery profiles that do not conform to known risk factors.