Heliyon (Apr 2024)
Identification and validation of sepsis subphenotypes using time-series data
Abstract
Purpose: The recognition of sepsis as a heterogeneous syndrome necessitates identifying distinct subphenotypes to guide the selection of targeted treatment.
Methods: Patients with sepsis from the MIMIC-IV database (2008–2019) were randomly divided into a development cohort (80%) and an internal validation cohort (20%). Patients with sepsis from the ICU database of Peking University People's Hospital (2008–2022) were included in the external validation cohort. Time-series k-means clustering analysis and dynamic time warping were performed to develop and validate sepsis subphenotypes by analyzing the trends of 21 vital signs and laboratory indicators within 24 h after sepsis onset. Inflammatory biomarkers were compared in the ICU database of Peking University People's Hospital, whereas treatment heterogeneity was compared in the MIMIC-IV database.
Findings: Three subphenotypes were identified in the development cohort. Type A patients (N = 2525, 47%) exhibited stable vital signs and fair organ function, type B patients (N = 1552, 29%) exhibited an obvious inflammatory response and stable organ function, and type C patients (N = 1251, 24%) exhibited severely impaired organ function with a deteriorating tendency. Type C demonstrated the highest mortality rate (33%) and the highest levels of inflammatory biomarkers, followed by type B (24%), whereas type A exhibited the lowest mortality rate (11%) and the lowest levels of inflammatory biomarkers. These subphenotypes were confirmed in both the internal and external validation cohorts, which showed similar features and comparable mortality rates. Among type C patients, survivors had significantly lower fluid intake within 24 h after sepsis onset (median 2891 mL, interquartile range (IQR) 1530–5470 mL) than non-survivors (median 4342 mL, IQR 2189–7305 mL). For types B and C, survivors showed a higher proportion of indwelling central venous catheters (p < 0.05).
Conclusion: Three novel subphenotypes of patients with sepsis were identified and validated using time-series data, revealing significant heterogeneity in inflammatory biomarkers and treatments, as well as consistency across cohorts.
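To make the clustering step in the Methods more concrete, the following is a minimal sketch of a dynamic time warping (DTW) distance and of the assignment step of a k-means-style clustering that uses it. It is an illustration under stated assumptions, not the authors' implementation: it handles a single indicator rather than the 21 used in the study, the function names `dtw_distance` and `assign_clusters` are hypothetical, the example values are invented, and the centroid update step (for example, barycenter averaging) is omitted.

```python
from math import inf


def dtw_distance(a, b):
    """DTW distance between two 1-D series (classic O(len(a) * len(b)) recursion)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative squared cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] ** 0.5


def assign_clusters(series, centroids):
    """Assignment step only: index of the DTW-nearest centroid for each series."""
    return [
        min(range(len(centroids)), key=lambda k: dtw_distance(s, centroids[k]))
        for s in series
    ]


# Hypothetical trajectories of one indicator (illustrative values, not study data).
centroids = [
    [80, 80, 80, 80],    # a stable trend
    [80, 90, 100, 110],  # a rising trend
]
patients = [
    [81, 79, 80, 80],        # roughly stable
    [80, 80, 90, 100, 110],  # rising, but starting one time step later
]
labels = assign_clusters(patients, centroids)
```

Because DTW allows time points to be stretched when aligning two series, the second hypothetical patient matches the rising centroid exactly even though the rise begins one step later and the series has a different length. A pointwise Euclidean distance would not be defined for series of unequal length and would penalize the time shift.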