PLoS ONE (Jan 2019)

Using machine learning models to predict oxygen saturation following ventilator support adjustment in critically ill children: A single center pilot study.

  • Sam Ghazal,
  • Michael Sauthier,
  • David Brossier,
  • Wassim Bouachir,
  • Philippe A Jouvet,
  • Rita Noumeir

DOI
https://doi.org/10.1371/journal.pone.0198921
Journal volume & issue
Vol. 14, no. 2
p. e0198921

Abstract

Background
In intensive care units, experts in mechanical ventilation are not continuously at the patient's bedside to adjust ventilation settings and to analyze the impact of these adjustments on gas exchange. The development of clinical decision support systems that analyze patients' data in real time offers an opportunity to fill this gap.

Objective
The objective of this study was to determine whether a machine learning predictive model could be trained on a set of clinical data and used to predict transcutaneous hemoglobin oxygen saturation 5 min (5min SpO2) after a ventilator setting change.

Data sources
Data of mechanically ventilated children admitted between May 2015 and April 2017 were included and extracted from a high-resolution research database. More than 776,727 data rows were obtained from 610 patients, discretized into 3 class labels (

Performance metrics of predictive models
Due to data imbalance, four different data balancing processes were applied. Then, two machine learning models (an artificial neural network and bootstrap aggregation of complex decision trees) were trained and tested on these four different balanced datasets. The best model predicted SpO2 with area under the curves

Conclusion
This single center pilot study using machine learning predictive models resulted in an algorithm with poor accuracy. The comparison of machine learning models showed that bagged complex trees were a promising approach. However, these models need to be improved before they are incorporated into a clinical decision support system. One potential solution for improving the predictive models would be to increase the amount of data available in order to limit over-fitting, which is potentially one of the causes of the poor classification performance for 2 of the 3 class labels.
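
The abstract does not name the four balancing processes that were applied. As a rough illustration of what balancing a 3-class dataset can look like, the sketch below assumes plain random oversampling, which may or may not be among the methods the authors used. The function name `random_oversample`, the class names `low`, `mid` and `high`, and the toy data are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one.

    Illustrative only: random oversampling is an assumed balancing method,
    not necessarily one of the four used in the study.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    balanced_idx = []
    for label, indices in by_class.items():
        balanced_idx.extend(indices)
        # Draw extra samples with replacement to reach the target count.
        extra = target - len(indices)
        balanced_idx.extend(rng.choices(indices, k=extra))

    rng.shuffle(balanced_idx)
    return [rows[i] for i in balanced_idx], [labels[i] for i in balanced_idx]


# Hypothetical toy data: three classes with a strong imbalance.
labels = ["low"] * 5 + ["mid"] * 15 + ["high"] * 80
rows = [[i] for i in range(len(labels))]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the test set and inflate the reported metrics.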