Frontiers in Digital Health (Feb 2023)

Learning meaningful latent space representations for patient risk stratification: Model development and validation for dengue and other acute febrile illness

  • Bernard Hernandez,
  • Oliver Stiff,
  • Damien K. Ming,
  • Chanh Ho Quang,
  • Vuong Nguyen Lam,
  • Tuan Nguyen Minh,
  • Chau Nguyen Van Vinh,
  • Nguyet Nguyen Minh,
  • Huy Nguyen Quang,
  • Lam Phung Khanh,
  • Tam Dong Thi Hoai,
  • Trung Dinh The,
  • Trieu Huynh Trung,
  • Bridget Wills,
  • Cameron P. Simmons,
  • Alison H. Holmes,
  • Sophie Yacoub,
  • Pantelis Georgiou,
  • on behalf of the Vietnam ICU Translational Applications Laboratory (VITAL) investigators

DOI
https://doi.org/10.3389/fdgth.2023.1057467
Journal volume & issue
Vol. 5

Abstract

Background: Increased data availability has prompted the creation of clinical decision support systems. These systems utilise clinical information to enhance health care provision, either to predict the likelihood of specific clinical outcomes or to evaluate the risk of further complications. However, their adoption remains low due to concerns regarding the quality of recommendations and a lack of clarity on how results are best obtained and presented.

Methods: We used autoencoders, which can reduce the dimensionality of complex datasets, to produce a 2D representation (the latent space) that supports understanding of complex clinical data. In this output, meaningful representations of individual patient profiles are spatially mapped in an unsupervised manner according to their input clinical parameters. This technique was then applied to a large real-world clinical dataset of over 12,000 patients with an illness compatible with dengue infection in Ho Chi Minh City, Vietnam, between 1999 and 2021. Dengue is a systemic viral disease which imposes a significant health and economic burden worldwide, and up to 5% of hospitalised patients develop life-threatening complications.

Results: The latent space produced by the selected autoencoder aligns with established clinical characteristics exhibited by patients with dengue infection, as well as with features of disease progression. Similar clinical phenotypes are represented close to each other in the latent space and clustered according to outcomes broadly described by the World Health Organisation dengue guidelines. Balancing distance metrics and density metrics produced results covering most of the latent space and improved visualisation whilst preserving utility, with similar patients grouped closer together. In this case, the balance is achieved by using the sigmoid activation function and one hidden layer with three neurons, in addition to the latent dimension layer, which produces the output (Pearson, 0.840; Spearman, 0.830; Procrustes, 0.301; GMM, 0.321).

Conclusion: This study demonstrates that, when adequately configured, autoencoders can produce two-dimensional representations of a complex dataset that conserve the distance relationship between points. The output visualisation groups patients with clinically relevant features closely together and inherently supports user interpretability. Work is underway to incorporate these findings into an electronic clinical decision support system to guide individual patient management.
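
The configuration reported in the Results (sigmoid activation, one hidden layer with three neurons, and a two-dimensional latent layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The mirrored decoder, the sigmoid on the latent layer, the squared-error loss, the full-batch gradient descent, the synthetic two-cluster data, and all function names are assumptions made for illustration. The Pearson value computed at the end is one plausible reading of a distance-preservation metric (correlation between pairwise distances in input space and in latent space), not necessarily the exact definition used in the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_features, rng, hidden=3, latent=2):
    # Layer sizes: input -> 3 hidden -> 2 latent -> 3 hidden -> output.
    # The mirrored decoder is an assumption, not stated in the abstract.
    sizes = [n_features, hidden, latent, hidden, n_features]
    return [(rng.normal(0.0, 0.5, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    # Returns the activations of every layer; index 2 is the 2D latent space.
    acts = [X]
    for W, b in params:
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def reconstruction_loss(params, X):
    out = forward(params, X)[-1]
    return 0.5 * np.mean(np.sum((out - X) ** 2, axis=1))

def train(params, X, lr=0.5, epochs=2000):
    # Plain full-batch gradient descent with hand-written backpropagation.
    params = [(W.copy(), b.copy()) for W, b in params]
    n = X.shape[0]
    for _ in range(epochs):
        acts = forward(params, X)
        delta = (acts[-1] - X) / n * acts[-1] * (1.0 - acts[-1])
        for i in reversed(range(len(params))):
            W, b = params[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Propagate the error through the pre-update weights.
                delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params

def pairwise_distances(A):
    # Euclidean distances for each unordered pair of rows.
    diff = A[:, None, :] - A[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(A), k=1)]

# Synthetic stand-in data (two clusters, six features), NOT clinical data.
rng = np.random.default_rng(0)
X_raw = np.vstack([rng.normal(0.0, 1.0, (100, 6)),
                   rng.normal(3.0, 1.0, (100, 6))])
# Min-max scale to [0, 1] so a sigmoid output layer can reconstruct it.
X = (X_raw - X_raw.min(axis=0)) / (X_raw.max(axis=0) - X_raw.min(axis=0))

params = init_params(X.shape[1], rng)
loss_before = reconstruction_loss(params, X)
params = train(params, X)
loss_after = reconstruction_loss(params, X)

latent = forward(params, X)[2]  # one 2D point per synthetic "patient"
pearson_r = np.corrcoef(pairwise_distances(X),
                        pairwise_distances(latent))[0, 1]
print(f"loss {loss_before:.4f} -> {loss_after:.4f}, distance Pearson r = {pearson_r:.3f}")
```

A value of `pearson_r` close to 1 would indicate that points far apart in the input space also tend to be far apart in the 2D latent space, which is the kind of distance conservation the Conclusion refers to.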

Keywords