Carbon Balance and Management (Dec 2018)

Selection criteria for linear regression models to estimate individual tree biomasses in the Atlantic Rain Forest, Brazil

  • Carlos Roberto Sanquetta,
  • Ana Paula Dalla Corte,
  • Alexandre Behling,
  • Luani Rosa de Oliveira Piva,
  • Sylvio Péllico Netto,
  • Aurélio Lourenço Rodrigues,
  • Mateus Niroh Inoue Sanquetta

DOI
https://doi.org/10.1186/s13021-018-0112-6
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 15

Abstract

Background: Biomass models are useful for several purposes, especially for quantifying carbon stocks and dynamics in forests. Selecting the appropriate equation among fitted models is a process that can involve several criteria, some widely used and others used to a lesser extent. This study analyzes six selection criteria for models fitted to six sets of individual tree biomass data collected from indigenous woody species of the Tropical Atlantic Rain Forest in Brazil. Six models were examined and the respective fitted equations were evaluated by the residual sum of squares, the adjusted coefficient of determination, the absolute and relative standard error of estimate, and the Akaike and Schwarz (Bayesian) information criteria. The aim of this study was to analyze the numerical behavior of these model selection criteria and to discuss how easily they can be interpreted. The importance of residual analysis in model selection is stressed.

Results: The adjusted coefficient of determination ($R^{2}_{adj.}$) and the standard error of estimate in percentage (Syx%) are relative model selection criteria and are not affected by sample size or by the scale of the response variable. The sum of squared residuals (SSR), the absolute standard error of estimate (Syx), the Akaike information criterion and the Schwarz information criterion, in turn, depend on these quantities. The best-fit model was always the same within a given data set regardless of the model selection criterion considered (except for SSR in two cases), indicating that the criteria tend to converge to a common result. However, the criteria are not always closely related across different data sets. General model selection criteria indicate the average goodness of fit, but do not capture bias and outlier effects. Graphical residual analysis is a useful tool for detecting these effects and must always be used in model selection.

Conclusions: The criteria for model selection tend to lead to a common result, regardless of their mathematical formulation and statistical significance. Relative measures of goodness of fit are easier to interpret than absolute ones. Careful graphical residual analysis must always be used to confirm the performance of the models.
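
The scale behavior of the six criteria can be illustrated with a minimal Python sketch. It is not the authors' code, and the abstract does not state the exact formulas used, so common textbook least-squares forms are assumed here: AIC = n ln(SSR/n) + 2p and BIC = n ln(SSR/n) + p ln(n), with p the number of estimated coefficients. The diameter and biomass values, and the function names `fit_line` and `selection_criteria`, are hypothetical and serve only as illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def selection_criteria(y, y_hat, p):
    """Six goodness-of-fit criteria; p = number of estimated coefficients.

    The AIC/BIC forms below are an assumption (least-squares variants),
    not necessarily the ones used in the paper.
    """
    n = len(y)
    mean_y = sum(y) / n
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    r2 = 1.0 - ssr / sst
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    syx = math.sqrt(ssr / (n - p))                         # absolute standard error
    syx_pct = 100.0 * syx / mean_y                         # relative standard error
    aic = n * math.log(ssr / n) + 2 * p
    bic = n * math.log(ssr / n) + p * math.log(n)
    return {"SSR": ssr, "R2adj": r2_adj, "Syx": syx,
            "Syx%": syx_pct, "AIC": aic, "BIC": bic}


def evaluate(x, y):
    """Fit the straight line and return its selection criteria."""
    b0, b1 = fit_line(x, y)
    y_hat = [b0 + b1 * xi for xi in x]
    return selection_criteria(y, y_hat, p=2)


# Hypothetical data: diameter (cm) and individual tree biomass (kg).
dbh = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
biomass_kg = [12.1, 30.5, 44.8, 66.0, 79.2, 101.3]
biomass_g = [1000.0 * b for b in biomass_kg]  # same trees, different unit

kg = evaluate(dbh, biomass_kg)
g = evaluate(dbh, biomass_g)

for name in kg:
    print(f"{name:6s} kg: {kg[name]:.4f}   g: {g[name]:.4f}")
```

Under these assumed formulas, changing the unit of the response variable leaves the adjusted coefficient of determination and Syx% unchanged, while SSR, Syx, AIC and BIC all shift, which is the scale dependence described in the Results.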

Keywords