PLoS ONE (Jan 2024)
Comparing machine learning approaches for estimating soil saturated hydraulic conductivity.
Abstract
Characterization of near (field) saturated hydraulic conductivity (Kfs) of the soil environment is a crucial component of hydrological modeling frameworks. Because the associated laboratory and field experiments are time-consuming and labor-intensive, pedotransfer functions (PTFs) that rely on statistical predictors are usually combined with existing measurements to predict Kfs in other areas of the field. In this study, several machine learning approaches, including variants of artificial neural networks (ANNs), were used to predict Kfs from easily measurable soil attributes. The analyses were performed using 100 measurements from the Bajgah Agricultural Experimental Station. First, the following physico-chemical inputs were measured: bulk density (BD), initial water content (Wi), saturated water content (Ws), mean weight diameter (MWD) and geometric mean diameter (GMD) of aggregates, pH, electrical conductivity (EC), and calcium carbonate equivalent (CCE). Then, radial basis function (RBFNNs), multilayer perceptron (MLPNNs), hybrid genetic algorithm (GA-NNs), and particle swarm optimization (PSO-NNs) neural networks were used to develop PTFs, and their accuracy was compared with that of a traditional multiple linear regression (MLR) model using statistical indices. The statistical assessment indicated that PSO-NNs, with the lowest root mean square error (RMSE) and mean absolute percentage error (MAPE) and the highest correlation coefficient (R), provided the most accurate and robust prediction of Kfs. On the test data set, the prediction models ranked as follows: PSO-NNs (R = 0.958; RMSE = 0.343; MAPE = 9.47), GA-NNs (R = 0.949; RMSE = 0.404; MAPE = 11.83), MLPNNs (R = 0.933; RMSE = 0.426; MAPE = 12.13), RBFNNs (R = 0.926; RMSE = 0.452; MAPE = 14.30), and MLR (R = 0.675; RMSE = 0.685; MAPE = 22.54). The results revealed that all NN models, particularly PSO-NNs, were efficient in predicting Kfs.
However, further evaluation under other soil conditions and with other input variables is recommended to quantify the models' uncertainties and to establish their wider applicability before they are used in other geographical locations.
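
The three statistical indices used to rank the models above can be sketched as follows. This is a minimal illustration that assumes the standard textbook definitions of R (Pearson correlation), RMSE, and MAPE; the abstract does not state the exact formulas used in the study. The function names and the small `observed`/`predicted` lists are hypothetical and serve only to exercise the code. The `reported` table, by contrast, holds the test-set values quoted in the abstract.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical values, for illustration only (not data from the study).
observed = [2.0, 4.0, 5.0, 8.0]
predicted = [2.2, 3.6, 5.5, 7.6]

# Test-set indices as reported in the abstract: (R, RMSE, MAPE).
reported = {
    "PSO-NNs": (0.958, 0.343, 9.47),
    "GA-NNs": (0.949, 0.404, 11.83),
    "MLPNNs": (0.933, 0.426, 12.13),
    "RBFNNs": (0.926, 0.452, 14.30),
    "MLR": (0.675, 0.685, 22.54),
}

# Rank models from best to worst by RMSE (lower is better).
ranking = sorted(reported, key=lambda name: reported[name][1])

print(pearson_r(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
print(ranking)
```

Sorting the reported values by RMSE reproduces the order given in the abstract, and sorting by MAPE (ascending) or R (descending) gives the same order, so the ranking does not depend on which of the three indices is chosen.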