مجله آب و خاک [Journal of Water and Soil] (Feb 2016)

Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression

  • Ali Asghar Besalatpour,
  • H. Shirani,
  • E. Eafandiyarpour

DOI
https://doi.org/10.22067/jsw.v0i0.22620
Journal volume & issue
Vol. 29, no. 2
pp. 406 – 417

Abstract

Introduction: Soil aggregate stability is a key factor in soil resistance to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010). Various indicators have been proposed to characterize and quantify soil aggregate stability, for example the percentage of water-stable aggregates (WSA), the mean weight diameter (MWD) and geometric mean diameter (GMD) of aggregates, and the water-dispersible clay (WDC) content (Calero et al., 2008). Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming, and difficult to standardize (Canasveras et al., 2010). It would therefore be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014). The main objective of this study is to investigate the potential of support vector machines (SVMs) for estimating soil aggregate stability (as quantified by GMD), as compared to the multiple linear regression approach.

Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E), located in the northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of the soil surface. Easily available topographic, vegetation, and soil properties were used as model inputs. Soil organic matter (SOM) content was determined by the Walkley-Black method (Nelson & Sommers, 1986). The particle size distribution of the soil samples (clay, silt, sand, fine sand, and very fine sand) was measured using the procedure described by Gee & Bauder (1986), and the calcium carbonate equivalent (CCE) content was determined by the back-titration method (Nelson, 1982). The modified Kemper & Rosenau (1986) method was used to determine wet-aggregate stability (GMD).
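As an illustration of the target variable, the sketch below computes GMD with the commonly used mass-weighted formula GMD = exp(Σ w_i ln x_i / Σ w_i), where w_i is the mass of aggregates retained in size class i and x_i is the mean diameter of that class. The abstract does not state the exact formula or sieve sizes of the modified Kemper & Rosenau procedure, so the function name and the sieve classes and masses in the usage lines are hypothetical.

```python
import math


def geometric_mean_diameter(mean_diameters_mm, masses_g):
    """Mass-weighted geometric mean diameter of aggregates, in mm.

    Assumes the standard definition exp(sum(w_i * ln(x_i)) / sum(w_i));
    the study's modified procedure may differ in detail.
    """
    if len(mean_diameters_mm) != len(masses_g):
        raise ValueError("one mass is required per size class")
    total_mass = sum(masses_g)
    if total_mass <= 0:
        raise ValueError("total aggregate mass must be positive")
    # Weighted mean of the log-diameters, then back-transform.
    weighted_log_sum = sum(
        mass * math.log(diameter)
        for diameter, mass in zip(mean_diameters_mm, masses_g)
    )
    return math.exp(weighted_log_sum / total_mass)


# Hypothetical sieve classes (mean diameter, mm) and retained masses (g).
example_diameters = [3.0, 1.5, 0.75, 0.375, 0.125]
example_masses = [4.0, 6.0, 8.0, 5.0, 2.0]
example_gmd = geometric_mean_diameter(example_diameters, example_masses)
```

Because the weights act on log-diameters, a sample whose mass sits entirely in one size class returns that class's mean diameter.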
The topographic attributes of elevation, slope, and aspect were derived from a 20-m by 20-m digital elevation model (DEM). The data set was divided into training and testing subsets: 70% of the samples were randomly selected for training, and the remaining 30% were used for testing. The correlation coefficient (r), mean square error (MSE), and error percentage (ERROR%) between the measured and predicted GMD values were used to evaluate the performance of the models.

Results and Discussion: The descriptive statistics showed little variability in the sample distributions of the variables used to develop the GMD prediction models, indicating that their values were all normally distributed. The constructed SVM model performed better in predicting GMD than the traditional multiple linear regression (MLR) model. The MSE and r values obtained for the SVM model were 0.005 and 0.86, respectively. The ERROR% value for soil aggregate stability prediction was 10.7% for the SVM model, compared with 15.7% for the regression model. The scatter plots also showed that the SVM model was more accurate in GMD estimation than the MLR model, since its predicted GMD values agreed more closely with the measured values for most of the samples. The poorer performance of the MLR model might be due to the larger amount of data required to develop a stable regression model compared to intelligent systems. Furthermore, linear models can capture only the linear effects of the predictors on the dependent variable, whereas in many cases the effects may not be linear in nature. The SVM model, in contrast, is suitable for modelling nonlinear relationships, and its major advantage is that it can be developed without knowing the exact form of the analytical function on which the model should be built.
All of this indicates that the SVM approach would be a better choice for predicting soil aggregate stability.

Conclusion: The pixel-scale soil aggregate stability predicted using the developed SVM and MLR models demonstrates the usefulness of incorporating topographic and vegetation information along with soil properties as predictors. However, the SVM model predicted soil aggregate stability more accurately than the MLR model. It therefore appears that support vector machines can be used to predict some soil physical properties, such as the geometric mean diameter of soil aggregates, in the study area. Furthermore, beyond the high predictive accuracy of the SVM method relative to the MLR technique, which the results of the current study confirmed, other advantages of the SVM method, such as its intrinsic effectiveness with respect to traditional prediction methods, the lower effort required to set up the control parameters for architecture design, and the possibility of solving the learning problem by constrained quadratic programming, should motivate soil scientists to explore it further.
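The evaluation protocol described in the Materials and Methods (a random 70/30 split, then r, MSE, and ERROR% between measured and predicted GMD) can be sketched as follows. This is a minimal standard-library sketch, not the authors' code: the function names and the random seed are hypothetical, and because the abstract does not define ERROR%, the version here (total absolute error as a percentage of total measured value) is an assumption.

```python
import math
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into training and testing subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def mean_square_error(measured, predicted):
    """Mean of squared differences between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def error_percent(measured, predicted):
    """ASSUMED definition: 100 * sum(|measured - predicted|) / sum(measured)."""
    abs_error = sum(abs(m - p) for m, p in zip(measured, predicted))
    return 100.0 * abs_error / sum(measured)


# With 160 samples, a 70/30 split gives 112 training and 48 testing samples.
train_ids, test_ids = split_train_test(range(160))
```

The same three functions would be applied to the test-set predictions of each model (SVM and MLR) so that both are compared on identical samples.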

Keywords