Geoscientific Model Development (May 2019)

Bayesian inference and predictive performance of soil respiration models in the presence of model discrepancy

  • A. S. Elshall,
  • M. Ye,
  • G.-Y. Niu,
  • G. A. Barron-Gafford

DOI
https://doi.org/10.5194/gmd-12-2009-2019
Journal volume & issue
Vol. 12
pp. 2009 – 2032

Abstract

Bayesian inference of microbial soil respiration models is often based on the assumptions that the residuals are independent (i.e., no temporal or spatial correlation), normally distributed (i.e., Gaussian noise), and have constant variance (i.e., homoscedastic). Because no model is perfect, model discrepancy is always present, and this study shows that these assumptions are generally invalid in soil respiration modeling: the residuals have high temporal correlation, a variance that increases with the magnitude of CO2 efflux, and a non-Gaussian distribution. Relaxing these three assumptions individually and in combination yields eight data models, which are the basis for formulating the likelihood functions of Bayesian inference. This study presents a systematic and comprehensive investigation of the impacts of data model selection on Bayesian inference and predictive performance. We use three mechanistic soil respiration models with different levels of model fidelity (i.e., model discrepancy) with respect to the number of carbon pools and the explicit representations of soil moisture controls on carbon degradation; the models therefore also differ in complexity with respect to the number of model parameters. The study shows that data models have substantial impacts on Bayesian inference and predictive performance of the soil respiration models, such that the following points are true: (i) the level of complexity of the best model is generally justified by the cross-validation results for different data models; (ii) not accounting for heteroscedasticity and autocorrelation might not necessarily result in biased parameter estimates or predictions, but will definitely lead to underestimated uncertainty; (iii) using a non-Gaussian data model improves the parameter estimates and the predictive performance; and (iv) accounting for autocorrelation only, or joint inversion of correlation and heteroscedasticity, can be problematic and requires special treatment. Although the conclusions of this study are empirical, the analysis may provide insights for selecting appropriate data models for soil respiration modeling.
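
As an illustration of how a data model turns into a likelihood function, the following minimal Python sketch enumerates the eight combinations of the three assumptions and evaluates a Gaussian log-likelihood that can optionally relax constant variance and independence. It is not the formulation used in the paper: the linear standard-deviation model, the first-order autoregressive (AR(1)) residual model, and the names `log_likelihood`, `sigma0`, `sigma1`, and `phi` are assumptions made here for illustration, and the non-Gaussian case is listed but not implemented.

```python
import itertools

import numpy as np

# Each data model either keeps (False) or relaxes (True) one of the three
# assumptions: independence, constant variance, Gaussian distribution.
ASSUMPTIONS = ("autocorrelated", "heteroscedastic", "non_gaussian")
data_models = [dict(zip(ASSUMPTIONS, flags))
               for flags in itertools.product([False, True], repeat=3)]


def log_likelihood(obs, sim, sigma0, sigma1=0.0, phi=0.0):
    """Gaussian log-likelihood of observations given simulated values.

    Hypothetical parameterization (an assumption, not the paper's):
      sd_t = sigma0 + sigma1 * sim_t   (sigma1 = 0 -> constant variance)
      standardized residuals follow a stationary AR(1) process with
      lag-1 coefficient phi            (phi = 0 -> independent residuals)
    Requires sd_t > 0 for all t and |phi| < 1.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    sd = sigma0 + sigma1 * sim
    eta = (obs - sim) / sd  # residuals scaled to unit variance

    # Remove the AR(1) dependence to obtain independent N(0, 1) innovations.
    z = np.empty_like(eta)
    z[0] = eta[0]
    z[1:] = (eta[1:] - phi * eta[:-1]) / np.sqrt(1.0 - phi**2)

    n = obs.size
    return (-0.5 * n * np.log(2.0 * np.pi)
            - np.sum(np.log(sd))                    # scaling by sd_t
            - 0.5 * (n - 1) * np.log(1.0 - phi**2)  # AR(1) innovation variance
            - 0.5 * np.sum(z**2))


if __name__ == "__main__":
    print(len(data_models), "data models")
    obs = [1.2, 2.1, 2.9, 4.3]   # made-up efflux values, for illustration only
    sim = [1.0, 2.0, 3.0, 4.0]
    print(log_likelihood(obs, sim, sigma0=0.2))                       # standard
    print(log_likelihood(obs, sim, sigma0=0.1, sigma1=0.05, phi=0.5)) # relaxed
```

With `sigma1=0` and `phi=0` the function reduces to the standard independent, homoscedastic Gaussian likelihood, so the same code covers four of the eight data models; a non-Gaussian variant would replace the normal density of the innovations with another distribution.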