International Soil and Water Conservation Research (Mar 2025)

Visible, near-infrared, and shortwave-infrared spectra as an input variable for digital mapping of soil organic carbon

  • Vahid Khosravi,
  • Asa Gholizadeh,
  • Radka Kodešová,
  • Prince Chapman Agyeman,
  • Mohammadmehdi Saberioon,
  • Luboš Borůvka

Journal volume & issue
Vol. 13, no. 1
pp. 203 – 214

Abstract

This study proposes a novel methodology that employs discrete point spectra as an input variable for digital mapping of soil organic carbon (SOC). Two SOC modeling approaches were applied at three agricultural sites in the Czech Republic: i) machine learning (ML), including partial least squares regression (PLSR), cubist, random forest (RF), and support vector regression (SVR), and ii) regression kriging (RK), combining ordinary kriging (OK) with PLSR (PLSR-K), cubist (cubist-K), RF (RF-K), and SVR (SVR-K). Models were developed on environmental predictor covariates (EPCs) and on spectra at thirty visible, near-infrared, and shortwave-infrared (VNIR–SWIR) wavelengths selected by a genetic algorithm (GA), individually and combined. Thirty rasters were then created by interpolating the selected spectra and served as input variables, with and without EPCs, to test and compare the developed models and SOC predictive maps with each other and with those retrieved from a third approach: iii) OK of the measured and ML-predicted SOC. The impact of employing the selected wavelengths' spectra and EPCs on model performance was investigated using independent test samples and the uncertainty associated with the produced maps. Using interpolated spectra as the only input variable yielded a relatively acceptable accuracy (Nová Ves: RMSE = 0.19%, Údrnice: RMSE = 0.12%, Klučov: RMSE = 0.13%). Coupling the interpolated spectra with EPCs improved the results. Regarding uncertainty, however, the ML-based SOC maps were more reliable than the RK-based ones. Furthermore, maps produced using both spectra and EPCs showed less uncertainty than those constructed on the individual datasets.
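
The regression kriging idea described in the abstract (a regression model for the trend plus ordinary kriging of its residuals) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, a plain least-squares fit stands in for PLSR, cubist, RF, or SVR, and the exponential variogram uses fixed, unfitted parameters. The names `exp_variogram`, `ordinary_kriging`, `regression_kriging`, and `rmse` are hypothetical helpers.

```python
import numpy as np


def exp_variogram(h, sill=1.0, range_=50.0):
    """Exponential semivariogram with zero nugget (illustrative, unfitted parameters)."""
    return sill * (1.0 - np.exp(-3.0 * h / range_))


def ordinary_kriging(xy, values, xy_new, sill=1.0, range_=50.0):
    """Ordinary kriging of `values` observed at `xy`, predicted at `xy_new`."""
    n = len(xy)
    # Kriging system: [[Gamma, 1], [1^T, 0]] [lambda, mu]^T = [gamma_0, 1]^T
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(np.linalg.norm(xy[:, None] - xy[None, :], axis=2), sill, range_)
    A[n, n] = 0.0
    b = np.ones((n + 1, len(xy_new)))
    b[:n] = exp_variogram(np.linalg.norm(xy[:, None] - xy_new[None, :], axis=2), sill, range_)
    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multiplier row
    return weights.T @ values


def regression_kriging(xy, X, y, xy_new, X_new):
    """Trend from a linear fit (stand-in for an ML model) plus kriged residuals."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    trend_new = np.column_stack([np.ones(len(X_new)), X_new]) @ beta
    return trend_new + ordinary_kriging(xy, residuals, xy_new)


def rmse(observed, predicted):
    """Root mean square error, in the units of the target variable."""
    return float(np.sqrt(np.mean((np.asarray(observed) - np.asarray(predicted)) ** 2)))


# Synthetic example (NOT data from the study): 40 sample points, 3 covariates
# standing in for selected wavelengths, and an SOC-like target in percent.
gen = np.random.default_rng(0)
xy_all = gen.uniform(0.0, 100.0, size=(40, 2))
X_all = gen.normal(size=(40, 3))
soc_all = 1.5 + X_all @ np.array([0.2, -0.1, 0.05]) + 0.1 * np.sin(xy_all[:, 0] / 20.0)

# Hold out the last 10 points as an independent test set.
xy_train, X_train, soc_train = xy_all[:30], X_all[:30], soc_all[:30]
xy_test, X_test, soc_test = xy_all[30:], X_all[30:], soc_all[30:]

soc_pred = regression_kriging(xy_train, X_train, soc_train, xy_test, X_test)
print("Test RMSE (%):", rmse(soc_test, soc_pred))
```

With a zero nugget, ordinary kriging reproduces the residuals exactly at sampled locations, so the sketch returns the measured values at the training points; this is why performance is assessed on held-out test samples, as the abstract describes.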

Keywords