Archives of Academic Emergency Medicine (Nov 2023)
Zero-Inflated Count Regression Models in Solving Challenges Posed by Outlier-Prone Data; an Application to Length of Hospital Stay
Abstract
Introduction: Ignoring outliers in data may lead to misleading results. Length of stay (LOS) is often treated as a count variable with a high frequency of outliers. This study illustrates how robust methods can improve the accuracy and reliability of analyses of skewed, outlier-prone LOS count data. Methods: The application of Zero-Inflated Poisson (ZIP) and robust Zero-Inflated Poisson (RZIP) models to outlier-prone LOS data was evaluated. The ZIP model has two components: a zero-inflation component that accounts for excess zeros and a Poisson component that models the counts. The RZIP model uses the Robust Expectation-Solution (RES) algorithm to improve parameter estimation and reduce the impact of outliers on the model's performance. Results: Data from 254 intensive care unit patients were analyzed (62.2% male). Patients aged 65 or older accounted for 58.3% of the sample. Notably, 38.6% of patients had an LOS of zero. The overall mean LOS was 5.89 (± 9.81) days, and 9.45% of cases were outliers. Analysis with the RZIP model identified significant predictors of LOS, including age, underlying comorbidities (p<0.001), and insurance status (p=0.013). Model comparison showed that the RZIP model outperformed the ZIP model, as evidenced by lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values. Conclusions: The RZIP model identified meaningful factors influencing LOS, which can support more informed decision-making in hospital management.
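As a rough illustration of the two-component structure described in the Methods, the sketch below writes out the standard ZIP probability mass function together with the usual AIC and BIC formulas. It covers only the plain ZIP likelihood without covariates; it is not the RES algorithm or the authors' implementation. The function names and the parameter values (pi = 0.3, lam = 4.0) are hypothetical and chosen only for illustration.

```python
import math

def zip_pmf(y: int, pi: float, lam: float) -> float:
    """P(Y = y) for a zero-inflated Poisson.

    pi  : probability of a structural (excess) zero  -- assumed name
    lam : mean of the Poisson count component        -- assumed name
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # A zero can come from the zero-inflation part or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson part.
    return (1.0 - pi) * poisson

def zip_loglik(ys, pi: float, lam: float) -> float:
    """Log-likelihood of observed counts under a covariate-free ZIP."""
    return sum(math.log(zip_pmf(y, pi, lam)) for y in ys)

def aic(loglik: float, k: int) -> float:
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

if __name__ == "__main__":
    # Made-up toy counts, not the study data.
    toy_los = [0, 0, 0, 2, 3, 5, 4, 0, 7, 1]
    ll = zip_loglik(toy_los, pi=0.3, lam=4.0)
    print("log-likelihood:", ll)
    print("AIC:", aic(ll, k=2))
    print("BIC:", bic(ll, k=2, n=len(toy_los)))
```

In a regression setting, pi and lam would each depend on covariates (commonly through logit and log links, respectively), and a lower AIC or BIC indicates the preferred model when candidates are fitted to the same data.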
Keywords