Land (Apr 2022)

Machine Learning Techniques for Estimating Hydraulic Properties of the Topsoil across the Zambezi River Basin

  • Mulenga Kalumba,
  • Edwin Nyirenda,
  • Imasiku Nyambe,
  • Stefaan Dondeyne,
  • Jos Van Orshoven

DOI
https://doi.org/10.3390/land11040591
Journal volume & issue
Vol. 11, no. 4
p. 591

Abstract

It is critical to produce more crop per drop in an environment where water availability is decreasing and competition for water is increasing. To build such agricultural production systems, well-parameterized crop growth models are essential. While most crop growth modeling research focuses on gathering model inputs such as climate data, less attention is paid to collecting the critical soil hydraulic properties (SHPs) data needed to operate crop growth models. Collecting SHP data for the Zambezi River Basin (ZRB) is extremely labor-intensive and expensive, so alternative technologies such as digital soil mapping (DSM) must be explored. We evaluated five types of DSM models to establish the best spatially explicit estimates of the soil water content at pF0.0 (saturation), pF2.0 (field capacity), and pF4.2 (wilting point), and of the saturated hydraulic conductivity (Ksat) across the ZRB. Estimates from locally calibrated pedotransfer functions at 1481 locations were used for training and testing the DSM models, and a reference dataset of measurements from 174 locations was used for validating them. We produced coverages of environmental covariates from various source datasets, including climate variables, soil and land use maps, parent materials and lithologic units, derivatives of a digital elevation model (DEM), and Landsat imagery with a spatial resolution of 90 m. The five types of models included multiple linear regression and four machine learning techniques: artificial neural network, gradient boosted regression trees, random forest, and support vector machine. Where the residuals of the initial DSM models were spatially autocorrelated, the models were complemented with residual kriging (RK). Spatial autocorrelation in the model residuals was observed for all five models of each of the three water contents, but not for Ksat.
For the water contents, the average R² ranged from 0.40 to 0.80 in the training and test datasets before adding kriged model residuals, and from 0.80 to 0.95 after adding them. Overall, the best prediction method consisted of random forest as the deterministic model, complemented with RK, whereby soil texture, followed by climate and topographic elevation variables, were the most important covariates. The resulting maps are a ready-to-use resource for hydrologists and crop modelers to feed and calibrate their hydrological and crop growth models.
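
The combination of a deterministic model with residual kriging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a simple linear trend on a single covariate stands in for the random forest, the exponential variogram parameters are assumed rather than fitted, and the coordinates, clay percentages, and water contents are made-up toy values. All function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept (stand-in for the deterministic model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def variogram(h, sill=1.0, range_=50.0):
    """Exponential variogram without nugget; sill and range are assumed values."""
    return sill * (1.0 - math.exp(-h / range_))

def ordinary_kriging(coords, values, target):
    """Ordinary kriging estimate of `values` at `target` (variogram form)."""
    n = len(coords)
    a = [[variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [variogram(math.dist(c, target)) for c in coords] + [1.0]
    weights = solve(a, b)[:n]    # last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values))

def predict(coords, covariate, observed, target_xy, target_cov):
    """Trend prediction plus the kriged residual at the target location."""
    slope, intercept = fit_line(covariate, observed)
    residuals = [o - (slope * c + intercept)
                 for c, o in zip(covariate, observed)]
    trend = slope * target_cov + intercept
    return trend + ordinary_kriging(coords, residuals, target_xy)

# Toy data (illustrative only): locations, clay content (%), and
# volumetric water content at field capacity.
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 20.0)]
clay = [10.0, 20.0, 30.0, 40.0, 25.0]
theta_fc = [0.18, 0.24, 0.27, 0.35, 0.26]

estimate = predict(coords, clay, theta_fc, (5.0, 5.0), 22.0)
```

Because the assumed variogram has no nugget, the kriged residuals reproduce the training residuals exactly at the sampled locations, so the combined prediction there equals the observed value. This is one reason fit statistics computed on training data can rise sharply once kriged residuals are added.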

Keywords