Ecology and Evolution (Mar 2019)
A hierarchical Bayesian approach for handling missing classification data
Abstract
Abstract Ecologists use classifications of individuals in categories to understand composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment of categories is often imperfect, but frequently treated as observations without error. When individuals are observed but not classified, these “partial” observations must be modified to include the missing data mechanism to avoid spurious inference. We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts, when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data the next. In the other approach, we use a small random sample of data within a year to inform the distribution of the missing data. We performed a simulation to show the bias that occurs when partial observations were ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes. We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.
Keywords