Journal of King Saud University: Science (Jun 2022)

Zero-inflated and hurdle models with an application to the number of involved axillary lymph nodes in primary breast cancer

  • Madiha Liaqat,
  • Shahid Kamal,
  • Florian Fischer,
  • Nadeem Zia

Journal volume & issue
Vol. 34, no. 4
p. 101932

Abstract

Objectives: This study explores factors influencing the number of involved axillary lymph nodes in women diagnosed with primary breast cancer, choosing an efficient model to account for the excess zeros and overdispersion present in the study population. Methods: The study is a retrospective analysis of hospital records of 5196 female breast cancer patients in Pakistan. Zero-inflated and hurdle modelling techniques were used to assess the association between the factors under study and the number of involved lymph nodes. Standard count data models (Poisson and negative binomial), zero-inflated models (zero-inflated Poisson and zero-inflated negative binomial), and hurdle models (hurdle Poisson and hurdle negative binomial) were applied. Model performance was compared on the basis of AIC, BIC, and how well each model captured the observed zero counts. Results: The zero-inflated negative binomial model provided an acceptable fit. Among patients in the high-risk group, women with larger tumors had a greater number of involved axillary lymph nodes; tumor grades II and III also contributed to higher numbers of involved lymph nodes. Age had no significant influence on nodal status. Conclusions: Our analysis showed that the zero-inflated negative binomial is the best model for predicting and describing the number of involved nodes in primary breast cancer when overdispersion arises from a large number of patients with no lymph node involvement. This is important for accurate prediction, both for therapy and for the prognosis of breast cancer patients.
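The comparison criteria named in the abstract (AIC, BIC, and how well the zero counts are captured) can be illustrated with a minimal sketch. This is not the authors' analysis: it assumes synthetic data, no covariates, and only two of the six models (plain Poisson and zero-inflated Poisson, the latter fitted with a simple EM loop). All function and variable names are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def poisson_logpmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def zip_logpmf(k, pi, lam):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(lam)."""
    if k == 0:
        return math.log(pi + (1 - pi) * math.exp(-lam))
    return math.log(1 - pi) + poisson_logpmf(k, lam)


def fit_zip_em(y, iters=200):
    """Intercept-only ZIP maximum likelihood via EM (assumes some nonzero counts)."""
    n, n0, total = len(y), sum(1 for v in y if v == 0), sum(y)
    pi, lam = 0.5, total / n
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero.
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        pi = n0 * w / n
        lam = total / (n - n0 * w)
    return pi, lam


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik


# Synthetic counts (an assumption for illustration only): 60% structural zeros,
# the remainder Poisson with mean 4.
rng = random.Random(0)
y = [0 if rng.random() < 0.6 else sample_poisson(4.0, rng) for _ in range(2000)]
n = len(y)
n_zeros = sum(1 for v in y if v == 0)

lam_pois = sum(y) / n  # Poisson MLE is the sample mean
ll_pois = sum(poisson_logpmf(v, lam_pois) for v in y)

pi_zip, lam_zip = fit_zip_em(y)
ll_zip = sum(zip_logpmf(v, pi_zip, lam_zip) for v in y)

results = {
    "Poisson": {
        "aic": aic(ll_pois, 1),
        "bic": bic(ll_pois, 1, n),
        "expected_zeros": n * math.exp(-lam_pois),
    },
    "ZIP": {
        "aic": aic(ll_zip, 2),
        "bic": bic(ll_zip, 2, n),
        "expected_zeros": n * (pi_zip + (1 - pi_zip) * math.exp(-lam_zip)),
    },
}

print("observed zeros:", n_zeros)
for name, r in results.items():
    print(name, {key: round(val, 1) for key, val in r.items()})
```

Lower AIC and BIC indicate a better trade-off between fit and complexity, and the expected-zeros figure shows how many zeros each fitted model predicts compared with the number observed. A full analysis in the spirit of the abstract would add covariates and the negative binomial and hurdle variants, typically with a statistics package rather than hand-written likelihoods.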

Keywords