European Journal of Remote Sensing (Jan 2021)
Comparisons of random forest and stochastic gradient treeboost algorithms for mapping soil electrical conductivity with multiple subsets using Landsat OLI and DEM/GIS-based data at a type oasis in Xinjiang, China
Abstract
Accurately assessing the spatial distribution and severity of soil salinization has long challenged local governments and researchers in the arid parts of the Xinjiang Uygur Autonomous Region (XJUAR). Machine learning algorithms such as Random Forest (RF) and Stochastic Gradient Treeboost (SGT) offer a promising approach; however, they have rarely been applied to the quantitative assessment of soil salinization. To evaluate the accuracy of the two algorithms for predicting soil salinity, twenty-seven subsets of environmental variables were designed. Each subset was modeled with both RF and SGT to identify an optimal set of variables. For 70.37% (19/27) of the subsets, the soil salinity predicted by SGT was closer to the observed values than that predicted by RF. Across all subsets, the average R² values for RF and SGT were 0.38 and 0.40, the average Root Mean Squared Error (RMSE) values were 28.59 and 27.46, and the average Ratio of Prediction to Deviation (RPD) values were 1.20 and 1.24, respectively. The dominant factors were coarse-resolution topographic variables, temperature and vegetation indices, land use, and landform.
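The three accuracy measures reported above can be illustrated with a minimal sketch. It assumes the common definitions (coefficient of determination as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the observations divided by the RMSE); the abstract does not state which formulas the authors used, and the function names and sample values below are hypothetical.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """RPD, assumed here to be the sample SD of the observations over the RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


# Hypothetical values for illustration only, not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R2   = {r_squared(observed, predicted):.2f}")
print(f"RPD  = {rpd(observed, predicted):.2f}")
```

Under these definitions, a lower RMSE and a higher RPD both indicate predictions closer to the observations, which is consistent with how the abstract compares SGT against RF.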
Keywords