Computational and Structural Biotechnology Journal (Jan 2023)
Infants’ gut microbiome data: A Bayesian Marginal Zero-inflated Negative Binomial regression model for multivariate analyses of count data
Abstract
The infant gut microbiome is dynamic in nature. The literature shows higher inter-individual variability in gut microbial composition during the early years of infancy than in adulthood. Although next-generation sequencing technologies are evolving rapidly, several aspects of statistical analysis still need to be addressed to capture the variability and dynamic nature of the infant gut microbiome. In this study, we propose a Bayesian Marginal Zero-inflated Negative Binomial (BAMZINB) model that addresses the complexities associated with zero-inflation and the multivariate structure of infant gut microbiome data. We simulated 32 scenarios to compare the performance of BAMZINB with glmFit and BhGLM, two comparable and widely used methods in the literature, in handling zero-inflation, over-dispersion, and the multivariate structure of the infant gut microbiome. We then demonstrated the performance of the BAMZINB approach on a real dataset from the SKOT I and II cohort studies. Our simulation results showed that the BAMZINB model performed as well as those two methods in estimating the average abundance difference and had a better fit in almost all scenarios when the signal and sample size were large. Applying BAMZINB to the SKOT cohorts showed notable changes in the average absolute abundance of specific bacteria from 9 to 18 months for infants of healthy and obese mothers. In conclusion, we recommend the BAMZINB approach for multivariate analysis of infant gut microbiome data when comparing average abundance differences, because it takes zero-inflation and over-dispersion into account.
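To make the terms zero-inflation and over-dispersion concrete, the following is a minimal sketch of a generic zero-inflated negative binomial (ZINB) distribution. It is not the BAMZINB implementation: the abstract does not state the model's parametrization, so the mean/size form (`mu`, `theta`), the zero-inflation probability `pi`, and all function names here are assumptions chosen for illustration. Under these assumptions, the overall (marginal) mean of the counts is `(1 - pi) * mu`, which is the kind of quantity a marginal model targets directly.

```python
import math


def nb_logpmf(y, mu, theta):
    """Log-pmf of a negative binomial with mean mu and size theta.

    Assumed parametrization: variance = mu + mu**2 / theta, so a smaller
    theta means stronger over-dispersion relative to a Poisson.
    """
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_pmf(y, mu, theta, pi):
    """Pmf of a ZINB: a structural zero with probability pi, else an NB draw."""
    nb = math.exp(nb_logpmf(y, mu, theta))
    if y == 0:
        # Zeros arise from the structural-zero component or from the NB itself.
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_marginal_mean(mu, pi):
    """Overall mean of the mixture (structural zeros included)."""
    return (1.0 - pi) * mu


# Hypothetical parameter values, for illustration only.
mu, theta, pi = 5.0, 2.0, 0.3
print(zinb_pmf(0, mu, theta, pi), zinb_marginal_mean(mu, pi))
```

In this sketch the probability of observing a zero exceeds what the negative binomial alone would give, while the marginal mean shrinks from `mu` to `(1 - pi) * mu`; comparing that marginal mean between groups corresponds to comparing average abundance.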