BMC Medical Informatics and Decision Making (Sep 2021)

Modeling count data for health care utilization: an empirical study of outpatient visits among Vietnamese older people

  • Duc Dung Le,
  • Roberto Leon Gonzalez,
  • Joseph Upile Matola

DOI
https://doi.org/10.1186/s12911-021-01619-2
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 14

Abstract

Background: Vietnam's population is aging rapidly, which raises potentially critical issues for older people; central among them is the demand for healthcare. However, healthcare utilization, measured here as count data, is challenging to model because such data are typically skewed, with a large mass at zero. This study compares empirical econometric strategies for modeling healthcare utilization (measured as the number of outpatient visits in the last 12 months) and, based on the best-fitting model, identifies the determinants of healthcare utilization among Vietnamese older people.

Methods: Using data from the 2006 Vietnam Household Living Standard Survey (N = 2426), we examined nine econometric regression models for count data to identify the best-fitting one. We used model selection criteria, statistical tests, and goodness-of-fit measures for in-sample model selection. In addition, we conducted 10-fold cross-validation checks to examine the reliability of the in-sample model selection. Finally, we used marginal effects from the best-fitting model to identify the factors associated with the number of outpatient visits among Vietnamese older people.

Results: We found strong evidence in favor of the hurdle negative binomial model 2 (HNB2) in both the in-sample selection and the 10-fold cross-validation checks. The marginal effects from the HNB2 showed that ethnicity, region, household size, health insurance, smoking status, non-communicable diseases, and disability were significantly associated with the number of outpatient visits. The predicted probabilities for each count event revealed distinct trends in healthcare utilization among specific groups: at low count events, women and people in the younger age group showed higher utilization than men and their counterparts in older age groups, whereas the reverse trend was found at higher count events.

Conclusions: The high degree of skewness and dispersion that typically characterizes healthcare utilization data affects which econometric models are appropriate for such data. In the case of Vietnamese older people, our findings suggest that hurdle negative binomial models should be used to model healthcare utilization, given that the data-generating process reflects two different decision-making processes.
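
To make the "two different decision-making processes" concrete, the following is a minimal sketch of how a hurdle negative binomial (NB2) distribution assigns probabilities to visit counts. It is not the authors' estimation code. The function names and the parameter values (`p_zero`, `mu`, `alpha`) are hypothetical and chosen only for illustration, and the sketch assumes the common NB2 parameterization in which the variance equals `mu + alpha * mu**2`.

```python
import math


def nb2_pmf(k, mu, alpha):
    """NB2 probability of count k, with mean mu and dispersion alpha.

    Assumed parameterization: variance = mu + alpha * mu**2.
    """
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def hurdle_nb2_pmf(k, p_zero, mu, alpha):
    """Hurdle NB2 probability of count k.

    Part 1 (the hurdle): no visit at all occurs with probability p_zero.
    Part 2 (intensity): given at least one visit, the count follows an
    NB2 distribution truncated at zero.
    """
    if k == 0:
        return p_zero
    truncated = nb2_pmf(k, mu, alpha) / (1.0 - nb2_pmf(0, mu, alpha))
    return (1.0 - p_zero) * truncated


def hurdle_nb2_mean(p_zero, mu, alpha):
    """Expected count under the hurdle NB2 model."""
    return (1.0 - p_zero) * mu / (1.0 - nb2_pmf(0, mu, alpha))


if __name__ == "__main__":
    # Hypothetical values, not estimates from the study.
    p_zero, mu, alpha = 0.4, 3.0, 1.5
    for k in range(6):
        print(k, round(hurdle_nb2_pmf(k, p_zero, mu, alpha), 4))
    print("expected visits:", round(hurdle_nb2_mean(p_zero, mu, alpha), 4))
```

In a fitted model, `p_zero` and `mu` would each depend on covariates through their own regression equations, which is why a covariate can affect the decision to seek care differently from the number of visits that follow.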

Keywords