ClinicoEconomics and Outcomes Research (Nov 2022)

Surgical Complication Risk Factor Identification Using High-Dimensional Hospital Data: An Illustrative Example in Hemostasis-Related Complications

  • Johnston S,
  • Jha A,
  • Roy S,
  • Pollack E

Journal volume & issue
Vol. 14
pp. 683 – 689

Abstract

Stephen Johnston,1 Aakash Jha,2 Sanjoy Roy,3 Esther Pollack3

1MedTech Epidemiology and Real-World Data Sciences, Johnson & Johnson, New Brunswick, NJ, USA; 2Decision Science, Mu Sigma, Bangalore, India; 3Ethicon, Cincinnati, OH, USA

Correspondence: Stephen Johnston, MedTech Epidemiology and Real-World Data Sciences, Johnson & Johnson, 410 George Street, New Brunswick, NJ, 08901, USA, Tel +1-443-254-2222, Email [email protected]

Purpose: To describe an approach in which high-dimensional hospital data are used to identify generalizable risk factors for surgical complications about which there may be limited prior knowledge, as illustrated in the context of hemostasis-related complications (HRC).

Patients and Methods: This was a retrospective study of the Premier Healthcare Database. Included patients underwent video-assisted thoracoscopic lobectomy (VATL), laparoscopic right colectomy (LRC), or laparoscopic sleeve gastrectomy (LSG) in an inpatient setting between October 2015 and February 2020; the first such procedure was designated the index. The outcome, HRC, comprised hemorrhage, control of bleeding, and acute posthemorrhagic anemia. For each cohort, a high-dimensional dataset (ie, comprising thousands of candidate risk factors) was constructed using taxonomies from the Clinical Classifications Software Refined (CCSR). Candidate risk factors were entered into logistic regression models with a 70%/30% train/test split for each cohort. Clinically plausible risk factors that were consistently significant predictors of HRC across the 3 training models were then used in a final parsimonious model that also included sex, age, race, and payor. Finally, the parsimonious model was applied to the test data to compare predicted risk with the observed incidence of HRC.

Results: The study included 11,141 VATL, 20,156 LRC, and 121,547 LSG patients, of whom 7.5%, 7.8%, and 1.2% experienced HRC, respectively. Ultimately, 6 clinically plausible CCSR categories were identified as statistically significant predictors across all 3 cohorts (eg, coagulation and hemorrhagic disorders, malnutrition, alcohol-related disorders). In the parsimonious model applied to the test data, the observed incidence of HRC was substantially higher in the top quintile than in the bottom quintile of predicted risk: LSG 2.05% vs 0.53%, LRC 13.30% vs 4.11%, VATL 12.49% vs 5.04%.

Conclusion: High-dimensional real-world data can be useful for identifying risk factors for outcomes that generalize across multiple cohorts. The risk factors identified herein should be considered for inclusion in future studies of hemostasis-related complications.

Keywords: hemostasis, high-dimensional, real-world data, surgery, complications
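Two steps of the approach described in the abstract can be illustrated with a minimal Python sketch: keeping only candidate factors that are significant in every cohort's training model, and comparing observed incidence across quintiles of predicted risk in the test data. This is not the authors' code. The function names, the 0.05 significance threshold, and all toy values are assumptions made for illustration; the abstract does not state the threshold used, and the clinical plausibility review is a human judgment that is not modeled here.

```python
from typing import Dict, List, Sequence


def consistent_predictors(
    pvalues_by_cohort: Dict[str, Dict[str, float]], alpha: float = 0.05
) -> List[str]:
    """Return factors whose p-value is below alpha in every cohort.

    alpha=0.05 is an assumed threshold, not one stated in the abstract.
    """
    significant_sets = [
        {name for name, p in pvalues.items() if p < alpha}
        for pvalues in pvalues_by_cohort.values()
    ]
    if not significant_sets:
        return []
    return sorted(set.intersection(*significant_sets))


def incidence_by_quintile(
    predicted_risk: Sequence[float], observed: Sequence[int]
) -> List[float]:
    """Observed outcome rate in each quintile of predicted risk, lowest first."""
    if len(predicted_risk) != len(observed):
        raise ValueError("predicted_risk and observed must have equal length")
    if len(observed) < 5:
        raise ValueError("need at least 5 patients to form quintiles")
    # Rank patients by predicted risk, then cut the ranking into 5 groups.
    ranked = sorted(zip(predicted_risk, observed), key=lambda pair: pair[0])
    n = len(ranked)
    rates = []
    for q in range(5):
        group = ranked[q * n // 5 : (q + 1) * n // 5]
        rates.append(sum(outcome for _, outcome in group) / len(group))
    return rates


# Toy p-values (hypothetical, NOT study results) for three candidate factors.
toy_pvalues = {
    "VATL": {"coagulation_disorders": 0.001, "malnutrition": 0.01, "obesity": 0.20},
    "LRC": {"coagulation_disorders": 0.002, "malnutrition": 0.03, "obesity": 0.04},
    "LSG": {"coagulation_disorders": 0.0005, "malnutrition": 0.04, "obesity": 0.60},
}
print(consistent_predictors(toy_pvalues))
# ['coagulation_disorders', 'malnutrition']

# Toy test-set predictions and outcomes (hypothetical, NOT study results).
toy_risk = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
toy_outcome = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
print(incidence_by_quintile(toy_risk, toy_outcome))
# [0.0, 0.0, 0.5, 0.5, 1.0]
```

In practice the p-values would come from the cohort-specific logistic regression fits on the 70% training data, and the predicted risks from the parsimonious model applied to the held-out 30%. A top-quintile rate well above the bottom-quintile rate, as reported in the Results, indicates that the model separates higher-risk from lower-risk patients.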

Keywords