Agriculture (Apr 2023)

Research on Provincial-Level Soil Moisture Prediction Based on Extreme Gradient Boosting Model

  • Yifang Ren,
  • Fenghua Ling,
  • Yong Wang

DOI
https://doi.org/10.3390/agriculture13050927
Journal volume & issue
Vol. 13, no. 5
p. 927

Abstract

Soil moisture is a key physical quantity in agricultural production: it can guide field irrigation and help evaluate the distribution of water resources available for crop growth across regions. However, soil moisture varies strongly in space, and its time series are highly noisy, nonlinear, and nonstationary, which makes accurate prediction difficult. In this study, taking Jiangsu Province in China as an example, data from 70 automatic meteorological and soil moisture observation stations from 2014 to 2022 were used to build prediction models of 0–10 cm soil relative humidity (RHs10cm) with the extreme gradient boosting (XGBoost) algorithm. Before model construction, the soil moisture observations were divided into three categories according to the measured soil physical characteristics: sandy soil, loam soil, and clay soil. Based on the impacts of various factors on the soil water budget, 14 predictors were chosen, of which 10 were atmospheric factors and 4 were soil factors. To account for differences in soil physical characteristics and the lagged effects of environmental influences, the best influence time of each predictor for each soil type was determined through correlation analysis, making the model construction more reasonable. To better evaluate the importance of soil factors, two sets of models (Model_soil&atmo and Model_atmo) were designed by treating the soil factors as optional predictors in the XGBoost model. The contributions of the predictors to the prediction results were analyzed with Shapley additive explanations (SHAP). Six prediction performance indicators, together with a typical drought process that occurred in 2022, were used to evaluate prediction accuracy. The results show that the time of highest correlation between the environmental predictors and RHs10cm varied, but was similar across soil types.
Among the atmospheric factors, maximum air temperature (Tamax), cumulative precipitation (Psum), and air relative humidity (RHa) had relatively high contribution rates in both models, acting as critical factors affecting the variation in soil moisture. In addition, adding soil factors improved the accuracy of soil moisture prediction. The XGBoost model also performed somewhat better than artificial neural networks (ANNs), random forests (RFs), and support vector machines (SVMs). The values of the correlation coefficient (R), root mean square error (RMSE), mean absolute error (MAE), mean absolute relative error (MARE), Nash–Sutcliffe efficiency coefficient (NSE), and accuracy (ACC) of Model_soil&atmo were 0.69, 11.11, 4.87, 0.12, 0.50, and 88%, respectively. This study verified that the XGBoost model is applicable to soil moisture prediction at the provincial level, as it reasonably predicted the development of the typical drought event.
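
As an illustration of the evaluation indicators named in the abstract, the sketch below computes R, RMSE, MAE, MARE, and NSE from paired observed and predicted values using their standard textbook definitions. It is not the authors' code: the function name `evaluate` and the sample values are hypothetical, and ACC is omitted because the abstract does not state how it is defined.

```python
import math

def evaluate(obs, pred):
    """Standard-definition skill scores for paired observations and predictions.

    Assumes equal-length sequences, nonzero observations (needed for MARE),
    and non-constant observations and predictions (needed for R and NSE).
    """
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n

    # Sums of squares and cross-products around the means.
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in pred)
    cross = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    sse = sum(e * e for e in errors)

    return {
        "R": cross / math.sqrt(ss_obs * ss_pred),          # Pearson correlation
        "RMSE": math.sqrt(sse / n),                        # root mean square error
        "MAE": sum(abs(e) for e in errors) / n,            # mean absolute error
        "MARE": sum(abs(e) / abs(o) for e, o in zip(errors, obs)) / n,
        "NSE": 1.0 - sse / ss_obs,                         # Nash-Sutcliffe efficiency
    }

# Hypothetical soil relative humidity values (%), for illustration only.
observed = [80.0, 75.0, 60.0, 90.0]
predicted = [78.0, 77.0, 66.0, 85.0]
scores = evaluate(observed, predicted)
print(scores)
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model is no more skillful than always predicting the observed mean; this is a useful reference point when reading the NSE of 0.50 reported above.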

Keywords