Journal of Translational Medicine (Jan 2022)
Ground truth labels challenge the validity of sepsis consensus definitions in critical illness
Abstract
Background: Sepsis is the leading cause of death in the intensive care unit (ICU). Expediting its diagnosis, which is largely determined by clinical assessment, improves survival. Predictive and explanatory modelling of sepsis in the critically ill commonly bases both the outcome definition and the predictions on the clinical criteria of consensus definitions of sepsis, which leads to circularity. As a remedy, we collected ground truth labels for sepsis.

Methods: In the Ground Truth for Sepsis Questionnaire (GTSQ), senior attending physicians in the ICU documented their opinion on each patient’s condition regarding sepsis daily, as a five-category working diagnosis and nine related items. Working diagnosis groups were described and compared, and their SOFA scores were analyzed with a generalized linear mixed model. Agreement and discriminatory performance measures were derived for clinical criteria of sepsis, with GTSQ labels as the reference class.

Results: We analyzed 7291 questionnaires and 761 complete encounters from the first survey year. Editing rates for all items were > 90%, and responses were consistent with the current understanding of critical illness pathophysiology, including sepsis pathogenesis. Interrater agreement for presence and absence of sepsis was almost perfect but only slight for suspected infection. ICU mortality was 19.5% in encounters with SIRS as the “worst” working diagnosis compared to 5.9% with sepsis and 5.9% with severe sepsis, without differences in admission and maximum SOFA. Compared to sepsis, proportions of GTSQs with SIRS plus acute organ dysfunction were equal and macrocirculatory abnormalities higher (p < 0.0001). SIRS proportionally ranked above sepsis in daily assessment of illness severity (p < 0.0001). Separate analyses of neurosurgical referrals revealed similar differences. Discriminatory performance of Sepsis-1/2 and Sepsis-3 compared to GTSQ labels was similar, with sensitivities around 70% and specificities of 92%. Essentially no difference between the prevalence of SIRS and SOFA ≥ 2 yielded sensitivities and specificities for detecting sepsis onset close to 55% and 83%, respectively.

Conclusions: GTSQ labels are a valid measure of sepsis in the ICU. They reveal suspicion of infection as an unclear clinical concept and refute an illness severity hierarchy in the SIRS-sepsis-severe sepsis spectrum. Ground truth challenges the accuracy of Sepsis-1/2 and Sepsis-3 in detecting sepsis onset. It is an indispensable intermediate step towards advancing diagnosis and therapy in the ICU and, potentially, other health care settings.
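
To make the agreement and discriminatory performance measures named in the Methods concrete, the following is a minimal Python sketch. It assumes binary labels (1 = sepsis, 0 = no sepsis), with a physician-assigned label as the reference class and a criteria-based label as the prediction. The function names and the example data are hypothetical and are not taken from the study; the authors' actual statistical procedure may differ.

```python
from typing import Sequence, Tuple


def confusion_counts(reference: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, with `reference` as ground truth."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for r, p in zip(reference, predicted):
        if r == 1 and p == 1:
            tp += 1
        elif r == 0 and p == 1:
            fp += 1
        elif r == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(reference: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(reference, predicted)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters assigning binary labels to the same cases."""
    tp, fp, tn, fn = confusion_counts(rater_a, rater_b)
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("no cases to compare")
    observed = (tp + tn) / n
    # Chance agreement from each rater's marginal proportion of positive labels.
    a_pos = (tp + fn) / n
    b_pos = (tp + fp) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Made-up illustration data (NOT study data): ten patient-days.
gtsq_label = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]      # hypothetical physician labels
criteria_label = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]  # hypothetical criteria-based labels

sens, spec = sensitivity_specificity(gtsq_label, criteria_label)
kappa = cohen_kappa(gtsq_label, criteria_label)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

Sensitivity and specificity treat one label as the reference, whereas kappa is symmetric and corrects for chance agreement, which is why it suits interrater comparisons where neither rater is the reference.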
Keywords