Revista Brasileira de Ciência do Solo (Apr 2018)

Prediction of Topsoil Texture Through Regression Trees and Multiple Linear Regressions

  • Helena Saraiva Koenow Pinheiro,
  • Waldir de Carvalho Junior,
  • César da Silva Chagas,
  • Lúcia Helena Cunha dos Anjos,
  • Phillip Ray Owens

DOI
https://doi.org/10.1590/18069657rbcs20170167
Journal volume & issue
Vol. 42, no. 0

Abstract

ABSTRACT: Users of soil survey products are mostly interested in understanding how soil properties vary in space and time. The aim of digital soil mapping (DSM) is to represent the spatial variability of soil properties quantitatively to support decision-making. The goal of this study is to evaluate two DSM techniques, Regression Trees (RT) and Multiple Linear Regressions (MLR), and their ability to predict mineral fraction content across a wide variety of landscapes. The study site was the entire Guapi-Macacu watershed (1,250.78 km²) in the state of Rio de Janeiro in the Southeast region of Brazil. Terrain attributes and remote sensing data (30 m spatial resolution) were used to represent landscape covariates, which served as the explanatory variables in the predictive models. The selection of sampling sites was based on the Latin Hypercube algorithm. A representative set of one hundred points with feasible field access was chosen. Different input databases were tested for prediction of mineral fraction content (harmonized and original data). The Spline algorithm was used to harmonize data according to the GlobalSoil.Net consortium standards. The results showed better performance from the RT models, which used an average of six covariates as input; the simplest MLR model used twice as many input variables, creating more complex models without gaining precision. Furthermore, better R² values were obtained using RT models, irrespective of harmonization of soil data. The harmonized dataset from the 0.00-0.05 and 0.05-0.15 m layers, in general, presented better results for clay and silt, with R² values of 0.52 (0.00-0.05 m) and 0.69 (0.05-0.15 m), respectively. Prediction of sand content showed better results when the original depth data were used as input, although all regression tree models had R² values greater than 0.52.
The RT models provided better statistical indices than MLR for all predicted properties; however, the variance between models suggests similar performance. Regarding harmonization of soil data, both input databases (harmonized or not) can be used to predict soil properties, since the variance of model performance was low and generalization of the soil maps showed similar trends. The products obtained from the digital soil mapping approach make it possible to incorporate uncertainty, providing easier interpretation for soil management and land use decisions.
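
To make the R² comparison between the two model families concrete, the following is a minimal sketch in plain Python, not the authors' workflow. It fits a depth-one regression tree (a single split) and an ordinary least-squares line with one predictor, then compares their in-sample R². The data are invented for illustration, and the names `slope_pct`, `clay`, `fit_line`, `fit_stump` and `r_squared` are assumptions; the models in the study used several covariates each.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns a predict function."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda xi: intercept + slope * xi


def fit_stump(x, y):
    """Depth-one regression tree: one threshold, mean of y on each side."""
    best = None
    values = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct x values,
    # so both sides of every candidate split are non-empty.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        # Keep the split with the smallest total squared error.
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


# Hypothetical data (assumption): a terrain covariate and topsoil clay content.
slope_pct = [2, 4, 6, 8, 10, 22, 24, 26, 28, 30]          # terrain slope, %
clay = [420, 410, 430, 415, 425, 250, 240, 260, 245, 255]  # clay, g/kg

line = fit_line(slope_pct, clay)
stump = fit_stump(slope_pct, clay)
r2_line = r_squared(clay, [line(v) for v in slope_pct])
r2_stump = r_squared(clay, [stump(v) for v in slope_pct])
print(f"R2 linear: {r2_line:.3f}  R2 stump: {r2_stump:.3f}")
```

On step-like data such as this, the single split captures the abrupt change that a straight line can only approximate, so the tree scores a higher R² with one decision rule. These are in-sample values on toy data; a fair comparison would evaluate both models on points held out from fitting.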

Keywords