Applied and Environmental Soil Science (Jan 2014)

Comparison of Three Supervised Learning Methods for Digital Soil Mapping: Application to a Complex Terrain in the Ecuadorian Andes

  • Martin Hitziger,
  • Mareike Ließ

DOI
https://doi.org/10.1155/2014/809495
Journal volume & issue
Vol. 2014

Abstract

A digital soil mapping approach is applied to a complex, mountainous terrain in the Ecuadorian Andes. Relief features are derived from a digital elevation model and used as predictors for the topsoil texture classes sand, silt, and clay. The performance of three statistical learning methods is compared: linear regression, random forest, and stochastic gradient boosting of regression trees. In linear regression, a stepwise backward variable selection procedure is applied and overfitting is controlled by minimizing Mallows' Cp. For random forest and boosting, the effect of predictor selection and tuning procedures is assessed. One hundred repetitions of a 5-fold cross-validation of the selected modelling procedures are employed for validation, uncertainty assessment, and method comparison. Absolute assessment of model performance is achieved by comparing the prediction error of the selected method with that of the mean. Boosting performs best, providing predictions that are reliably better than the mean. The median reduction of the root mean square error is around 5%. Elevation is the most important predictor. All models clearly distinguish ridges and slopes. The predicted texture patterns are interpreted as the result of catena sequences (eluviation of fine particles on slope shoulders) and landslides (mixing of mineral soil horizons on slopes).
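
The validation scheme described in the abstract (repeated 5-fold cross-validation, with the model's error judged against the error of simply predicting the mean) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a one-predictor least-squares fit for the boosting model, runs on synthetic data, and every name in it (`repeated_cv`, `fit_line`, `elevation`, `sand`) is hypothetical.

```python
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def repeated_cv(x, y, k=5, repeats=100, seed=0):
    """Relative RMSE reduction versus the mean, one value per repetition."""
    rng = random.Random(seed)
    n = len(y)
    reductions = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random partition each repetition
        folds = [idx[i::k] for i in range(k)]
        pred_model = [0.0] * n
        pred_mean = [0.0] * n
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            x_train = [x[i] for i in train]
            y_train = [y[i] for i in train]
            b0, b1 = fit_line(x_train, y_train)
            mean_train = statistics.fmean(y_train)  # baseline uses training data only
            for i in fold:
                pred_model[i] = b0 + b1 * x[i]
                pred_mean[i] = mean_train
        reductions.append(1.0 - rmse(y, pred_model) / rmse(y, pred_mean))
    return reductions


# Synthetic stand-in data (assumption): sand content decreasing with elevation.
data_rng = random.Random(42)
elevation = [data_rng.uniform(1900, 3100) for _ in range(60)]
sand = [70 - 0.01 * e + data_rng.gauss(0, 4) for e in elevation]

reductions = repeated_cv(elevation, sand)
median_reduction = statistics.median(reductions)
print(f"median RMSE reduction vs. mean: {median_reduction:.1%}")
```

The spread of `reductions` across repetitions gives a rough picture of how reliably the model beats the mean; a distribution lying entirely above zero corresponds to predictions that are "reliably better than the mean" in the sense used above.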