Intensive Care Medicine Experimental (Feb 2023)

Precision of CT-derived alveolar recruitment assessed by human observers and a machine learning algorithm in moderate and severe ARDS

  • Ludmilla Penarrubia,
  • Aude Verstraete,
  • Maciej Orkisz,
  • Eduardo Davila,
  • Loic Boussel,
  • Hodane Yonis,
  • Mehdi Mezidi,
  • Francois Dhelft,
  • William Danjou,
  • Alwin Bazzani,
  • Florian Sigaud,
  • Sam Bayat,
  • Nicolas Terzi,
  • Mehdi Girard,
  • Laurent Bitker,
  • Emmanuel Roux,
  • Jean-Christophe Richard

DOI
https://doi.org/10.1186/s40635-023-00495-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 15

Abstract

Background Assessing measurement error in alveolar recruitment on computed tomography (CT) is of paramount importance to select a reliable threshold identifying patients with high potential for alveolar recruitment and to rationalize positive end-expiratory pressure (PEEP) setting in acute respiratory distress syndrome (ARDS). The aim of this study was to assess both the intra- and inter-observer smallest real difference (SRD) exceeding the measurement error of recruitment, using both human and machine learning-made lung segmentation (i.e., delineation) on CT.

Methods This single-center observational study was performed on adult ARDS patients. CT scans were acquired at end-expiration and end-inspiration at the PEEP level selected by clinicians, and at end-expiration at PEEP 5 and 15 cmH2O. Two human observers and a machine learning algorithm performed lung segmentation. Recruitment was computed as the weight change of the non-aerated compartment on CT between PEEP 5 and 15 cmH2O.

Results Thirteen patients were included, of whom 11 (85%) presented with severe ARDS. Intra- and inter-observer measurements of recruitment were virtually unbiased, with 95% confidence intervals (CI95%) encompassing zero. The intra-observer SRD of recruitment amounted to 3.5 [CI95% 2.4–5.2]% of lung weight. The human–human inter-observer SRD of recruitment was slightly higher, amounting to 5.7 [CI95% 4.0–8.0]% of lung weight, as was the human–machine SRD (5.9 [CI95% 4.3–7.8]% of lung weight). Regarding other CT measurements, both intra-observer and inter-observer SRD were close to zero for the CT measurements focusing on aerated lung (end-expiratory lung volume, hyperinflation), and higher for the CT measurements relying on accurate segmentation of the non-aerated lung (lung weight, tidal recruitment…). The average symmetric surface distance between lung segmentation masks was significantly lower in intra-observer comparisons (0.8 mm [interquartile range (IQR) 0.6–0.9]) than in human–human (1.0 mm [IQR 0.8–1.3]) and human–machine inter-observer comparisons (1.1 mm [IQR 0.9–1.3]).

Conclusions The SRD exceeding intra-observer experimental error in the measurement of alveolar recruitment may be conservatively set to 5% (i.e., the upper value of the CI95%). Human–machine and human–human inter-observer measurement errors with CT are of similar magnitude, suggesting that machine learning segmentation algorithms are a credible alternative to human observers for quantifying alveolar recruitment on CT.
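For readers unfamiliar with the two quantities at the center of the abstract, the following is a minimal sketch of how they could be computed; it is not the authors' code. It assumes that recruitment is counted as positive when non-aerated weight decreases from PEEP 5 to PEEP 15, that it is normalized by a total lung weight supplied by the caller, and that the SRD follows the common definition 1.96 × √2 × SEM, with the standard error of measurement (SEM) estimated from paired repeated measurements. The abstract does not state which estimator or normalization the study used, and all function and parameter names are hypothetical.

```python
import math


def recruitment_percent(nonaerated_peep5_g, nonaerated_peep15_g, lung_weight_g):
    """Recruitment in % of lung weight.

    Assumption: recruitment is the decrease in non-aerated lung weight
    when PEEP goes from 5 to 15 cmH2O, divided by total lung weight.
    """
    if lung_weight_g <= 0:
        raise ValueError("lung_weight_g must be positive")
    return 100.0 * (nonaerated_peep5_g - nonaerated_peep15_g) / lung_weight_g


def mean_bias(first, second):
    """Mean paired difference between two series of measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    return sum(a - b for a, b in zip(first, second)) / len(first)


def smallest_real_difference(first, second):
    """SRD = 1.96 * sqrt(2) * SEM for paired repeated measurements.

    Assumption: SEM is the within-subject standard deviation, which for
    two measurements per patient is sqrt(sum(d^2) / (2 * n)), where d is
    the paired difference and n the number of patients.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    squared = [(a - b) ** 2 for a, b in zip(first, second)]
    sem = math.sqrt(sum(squared) / (2 * len(squared)))
    return 1.96 * math.sqrt(2) * sem
```

Under these assumptions, `first` and `second` would hold the recruitment values (in % of lung weight) obtained for the same patients from two segmentations, for example the same observer twice, two human observers, or a human and the algorithm. A change in recruitment smaller than the returned SRD cannot be distinguished from measurement error.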

Keywords