Axioms (Feb 2024)
A Statistical Model for Count Data Analysis and Population Size Estimation: Introducing a Mixed Poisson–Lindley Distribution and Its Zero Truncation
Abstract
Count data consists of both observed and unobserved events. The analysis of count data often encounters overdispersion, where traditional Poisson models may not be adequate. In this paper, we introduce a tractable one-parameter mixed Poisson distribution, which combines the Poisson distribution with the improved second-degree Lindley distribution. This distribution, called the Poisson-improved second-degree Lindley distribution, is capable of effectively modeling standard count data with overdispersion. However, if the frequency of the unobserved events is unknown, the proposed distribution cannot be directly used to describe the events. To address this limitation, we propose a modification by truncating the distribution to zero. This results in a tractable zero-truncated distribution that encompasses all types of dispersions. Due to the unknown frequency of unobserved events, the population size as a whole becomes unknown and requires estimation. To estimate the population size, we develop a Horvitz–Thompson-like estimator utilizing truncated distribution. Both the untruncated and truncated distributions exhibit desirable statistical properties. The estimators for both distributions, as well as the population size, are asymptotically unbiased and consistent. The current study demonstrates that both the truncated and untruncated distributions adequately explain the considered medical datasets, which are the number of dicentric chromosomes after being exposed to different doses of radiation and the number of positive Salmonella. Moreover, the proposed population size estimator yields reliable estimates.
Keywords