Comparative assessment of parameter estimation methods in the presence of overdispersion: a simulation study

Kimberlyn Roosa; Ruiyan Luo; Gerardo Chowell

doi:10.3934/mbe.2019214

Mathematical Biosciences and Engineering (May 2019)

Affiliations

Kimberlyn Roosa: 1. Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA
Ruiyan Luo: 1. Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA
Gerardo Chowell: 1. Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA 2. Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, MD, USA

DOI: https://doi.org/10.3934/mbe.2019214
Journal volume & issue: Vol. 16, no. 5
pp. 4299–4313

Abstract

The Poisson distribution is commonly assumed as the error structure for count data; however, empirical data may exhibit greater variability than expected under a given statistical model. Greater variability could point to model misspecification, such as missing crucial information about the epidemiology of the disease or changes in population behavior. When the mechanism producing the apparent overdispersion is unknown, it is typically assumed that the variance in the data exceeds the mean (by some scaling factor). Thus, a probability distribution that allows for overdispersion (negative binomial, for example) may better represent the data. Here, we use simulation studies to assess how misspecifying the error structure affects parameter estimation results, specifically bias and uncertainty, as a function of the level of random noise in the data. We compare results for two parameter estimation methods: nonlinear least squares and maximum likelihood estimation with Poisson error structure. We analyze two phenomenological models, the generalized growth model and the generalized logistic growth model, to assess how parameter estimation results are affected by the level of overdispersion in the data. We use simulation to obtain confidence intervals and mean squared error of parameter estimates. We also analyze the impact of the amount of data, or ascending phase length, on the results of the generalized growth model for increasing levels of overdispersion. The results show a clear pattern of increasing uncertainty, or confidence interval width, as the overdispersion in the data increases. While maximum likelihood estimation consistently yields narrower confidence intervals and smaller mean squared error, differences between the two methods are minimal and not practically significant. At moderate levels of overdispersion, both estimation methods yield similar performance. Importantly, it is shown that issues of parameter uncertainty and bias in the presence of overdispersion can be mitigated with the inclusion of more data.
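The kind of comparison described in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes the commonly cited form of the generalized growth model, dC/dt = r C^p with 0 <= p < 1, and draws negative binomial noise whose variance equals a scaling factor `phi` times the mean. The values of `r_true`, `p_true`, `c0`, `n_days`, and the grid ranges are hypothetical, and a plain grid search stands in for a numerical optimizer. Interval widths come from repeated simulation at the true parameters, which may differ from the procedure used in the paper.

```python
import numpy as np

def ggm_incidence(t, r, p, c0=5.0):
    """Mean incidence dC/dt = r * C(t)**p from the closed-form GGM solution (0 <= p < 1)."""
    m = 1.0 / (1.0 - p)
    cumulative = (r * t / m + c0 ** (1.0 / m)) ** m
    return r * cumulative ** p

def add_noise(mean, phi, rng):
    """Draw counts with variance = phi * mean; phi = 1 reduces to Poisson."""
    if phi <= 1.0:
        return rng.poisson(mean)
    # numpy's negative_binomial(n, q) has mean n(1-q)/q and variance n(1-q)/q**2,
    # so q = 1/phi and n = mean/(phi - 1) give the requested variance-to-mean ratio.
    return rng.negative_binomial(mean / (phi - 1.0), 1.0 / phi)

def fit_grid(t, y, r_grid, p_grid, c0=5.0):
    """Return ((r, p) by least squares, (r, p) by Poisson likelihood) over a grid."""
    R, P = np.meshgrid(r_grid, p_grid, indexing="ij")
    mu = ggm_incidence(t[None, None, :], R[..., None], P[..., None], c0)
    sse = ((y - mu) ** 2).sum(axis=-1)        # nonlinear least squares objective
    nll = (mu - y * np.log(mu)).sum(axis=-1)  # Poisson negative log-likelihood (up to a constant)
    i_ls = np.unravel_index(np.argmin(sse), sse.shape)
    i_ml = np.unravel_index(np.argmin(nll), nll.shape)
    return (float(R[i_ls]), float(P[i_ls])), (float(R[i_ml]), float(P[i_ml]))

def run_study(phi, n_sims=100, seed=0, r_true=0.5, p_true=0.8, c0=5.0, n_days=20):
    """Simulate n_sims noisy curves and summarize the estimates of r for both methods."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_days, dtype=float)
    mean = ggm_incidence(t, r_true, p_true, c0)
    r_grid = np.linspace(0.1, 1.0, 91)   # hypothetical search range for r
    p_grid = np.linspace(0.5, 0.98, 49)  # hypothetical search range for p
    fits = np.array([fit_grid(t, add_noise(mean, phi, rng), r_grid, p_grid, c0)
                     for _ in range(n_sims)])
    r_hat = fits[:, :, 0]                # column 0: least squares, column 1: Poisson MLE
    lo, hi = np.percentile(r_hat, [2.5, 97.5], axis=0)
    mse = ((r_hat - r_true) ** 2).mean(axis=0)
    return {"nls_width": hi[0] - lo[0], "mle_width": hi[1] - lo[1],
            "nls_mse": mse[0], "mle_mse": mse[1]}

if __name__ == "__main__":
    for phi in (1.0, 5.0, 20.0):
        print(phi, run_study(phi))
```

Increasing `n_days` in `run_study` gives a rough way to explore how a longer ascending phase changes the interval widths at a fixed `phi`.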

Published in Mathematical Biosciences and Engineering

ISSN: 1551-0018 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Technology: Chemical technology: Biotechnology; Science: Mathematics
Website: https://www.aimspress.com/journal/MBE
