Journal of Water and Soil (مجله آب و خاک) (Feb 2024)

Digital Mapping of Soil Texture Particles with Machine Learning Models and Environmental Covariates

  • P. Khosravani,
  • M. Baghernejad,
  • A.A. Moosavi,
  • S.R. Fallah Shamsi

DOI
https://doi.org/10.22067/jsw.2023.84413.1331
Journal volume & issue
Vol. 37, no. 6
pp. 923 – 942

Abstract


Introduction

Understanding the particle size distribution (PSD) of soil is important for plant growth and soil management. In recent years, soil science has seen a marked increase in digital soil mapping (DSM) activity. Machine learning (ML) models, which are mainly used for data mining and pattern recognition and are now widely applied to regression and classification tasks across the sciences, have emerged as an alternative tool for DSM. This study therefore spatially modeled sand, silt, and clay particles using two ML models, Random Forest (RF) and Support Vector Regression (SVR), and the Co-Kriging geostatistical model, with high-spatial-resolution auxiliary variables as covariates. The investigation was conducted in a section of the Marvdasht plain in Fars province.

Materials and Methods

The study area is a part of the Marvdasht plain lying between 52°41′35.82″ and 52°57′1.07″ east longitude and 29°48′35.02″ and 30°2′14.72″ north latitude, 40 km north of Shiraz, with an area of about 50,000 hectares. After the boundaries of the study area were determined, the positions of 200 sampling points were selected in R using the conditioned Latin hypercube sampling method. That is, for soil property modeling, 200 samples were taken from two depths, 0 to 30 cm and 30 to 60 cm, in the study area. The samples were then transferred to the laboratory, dried, and passed through a 2 mm sieve, and the soil texture components were measured by the hydrometer method. The environmental variables used in this study represent a wide range of soil-forming factors and were prepared, as far as possible, from low-cost and easily accessible sources.
In total, 75 environmental variables were prepared as raster layers: 39 elevation-related (topographic) variables and 36 remote sensing variables. Finally, the variance inflation factor (VIF) and the Boruta algorithm were used to select the optimal variables.

Results

The minimum clay content was 10.21% and 10.45%, and the maximum was 32.65% and 36.35%, at the surface and subsurface depths, respectively. The average amount of clay in all samples was 37.91% and 35.61%. The average amount of sand was 25.65% and 26.02% at the surface and subsurface depths, respectively. The maximum amount of sand, 54.68%, was observed in the northern and higher parts of the study area, and the minimum amount was predicted in the low-lying areas. Low-lying areas and sedimentary plains in the central part of the study area contained high amounts of silt. Four variables related to geomorphometric parameters, valley depth (VD), texture (TE), topographic wetness index (TWI), and clay index (CI), and one variable related to remote sensing indices, the normalized difference vegetation index (NDVI), were selected as the optimal variables. The RF model, with R² of 54.0% and 36.0% for sand, 48.0% and 64.0% for silt, and 52.0% and 49.0% for clay at the surface and subsurface depths, respectively, performed better than the SVR and Co-Kriging models. The most effective variable in predicting the spatial distribution of soil particles was VD, with a relative importance of 60% and 65% for sand at the surface and subsurface depths, 70% for silt at the surface depth, and 70% and 65% for clay at the surface and subsurface depths, respectively. Only for silt at the subsurface depth were TE and TWI more important than VD.
These results show that topographic variables strongly influence the spatial variation of soil particles. Unlike clay, sand was highest at both depths in the northern, highest part of the study area and lowest in the low-lying areas.

Conclusion

In this research, maps of the spatial distribution of soil texture components were prepared at both surface and subsurface depths for a part of the Marvdasht plain, using machine learning and geostatistical approaches along with environmental covariates. Among the selected environmental covariates, topographic attributes, especially valley depth (VD), contributed most to explaining the spatial prediction of soil texture components. The comparison of model performance also supported the higher efficiency of the RF model relative to the other models. The approach used in this study can therefore serve as a guide for mapping soil properties in areas with similar climatic and topographic conditions.
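
The covariate screening mentioned under Materials and Methods relies in part on the variance inflation factor. As a rough illustration only, the sketch below computes VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing covariate j on all the others. It is not the authors' workflow: the covariate values are synthetic, the column `vd_copy` is a hypothetical near-duplicate added to trigger collinearity, and the cutoff of 10 is a common rule of thumb that the abstract does not state. The Boruta step is not shown.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared_of_regression(y, predictors):
    """R^2 of a least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """Map each covariate name to 1 / (1 - R^2 of that covariate on the rest)."""
    vifs = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        vifs[name] = 1.0 / (1.0 - r_squared_of_regression(values, others))
    return vifs


# Synthetic stand-ins for covariate values at 200 sampling points (assumption).
random.seed(42)
n_points = 200
vd = [random.gauss(0, 1) for _ in range(n_points)]
covariates = {
    "vd": vd,
    "twi": [random.gauss(0, 1) for _ in range(n_points)],
    "ndvi": [random.gauss(0, 1) for _ in range(n_points)],
    # Hypothetical near-duplicate of vd, included to show a high VIF.
    "vd_copy": [0.9 * v + 0.1 * random.gauss(0, 1) for v in vd],
}

VIF_THRESHOLD = 10.0  # assumed cutoff; not given in the abstract
vifs = variance_inflation_factors(covariates)
kept = [name for name, value in vifs.items() if value < VIF_THRESHOLD]
print(vifs)
print(kept)
```

Here both `vd` and `vd_copy` are flagged because each is almost fully explained by the other. In practice one would usually drop only the covariate with the highest VIF, recompute, and repeat until all remaining values fall below the chosen cutoff. Exactly collinear columns would make the division by 1 - R² fail, so a real pipeline should guard against that case.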

Keywords