Water Science and Technology (Oct 2023)

Model evaluation of total phosphorus prediction based on model accuracy and interpretability for the surface water in the river network of the Jiangnan Plain, China

  • Hao Zhang,
  • Juan Huan,
  • Xiangen Xu,
  • Bing Shi,
  • Yongchun Zheng,
  • Weijia Mao,
  • Jiapeng Lv

DOI
https://doi.org/10.2166/wst.2023.310
Journal volume & issue
Vol. 88, no. 8
pp. 2108 – 2120

Abstract

Due to climatic and hydrological changes and human activities, eutrophication and frequent cyanobacterial blooms are prominent in the Jiangnan Plain basin of China. Building a suitable model to accurately predict the phosphorus concentration in surface water is therefore of practical significance for preventing these problems. This study built 10 models to predict total phosphorus in the surface water of the river network in the Jiangnan Plain. The main water bodies in the basin are the Yangtze River, the Beijing-Hangzhou Canal, and Gehu Lake. The 10 models were comprehensively evaluated on each dataset by prediction accuracy and interpretability, and partial dependence plots (PDP) and SHAP values showed a transparent response relationship between phosphorus and the different factors. Under this combined evaluation of accuracy and interpretability, the most suitable models for the Yangtze River, the Beijing-Hangzhou Canal, and Gehu Lake are random forest, linear regression, and random forest, respectively. Models with low prediction accuracy often show strong interpretability. Across the different water body types, turbidity, water temperature, and chlorophyll-a are the three factors that most influence the models' phosphorus predictions.

HIGHLIGHTS

  • Ten models were constructed on three datasets: the Yangtze River, the Beijing-Hangzhou Canal, and Gehu Lake.
  • Three criteria for model interpretability were proposed, and the 10 models were ranked for interpretability.
  • Water temperature, chlorophyll-a, and turbidity were found to be the most influential factors in predicting total phosphorus in the three water quality datasets.
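
The partial dependence calculation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `toy_predict`, its coefficients, the feature names, and the synthetic rows are all hypothetical stand-ins for a fitted model and monitoring data. The sketch only shows the general idea of partial dependence, which is to force one feature to each value on a grid, leave the other features as observed, and average the model's predictions.

```python
import random
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Return the average prediction over all rows with `feature`
    forced to each value in `grid` (other features left as observed)."""
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(r) for r in modified))
    return curve


def toy_predict(row):
    # Hypothetical stand-in for a fitted model; the coefficients are
    # invented for illustration and are NOT results from the study.
    return (0.02
            + 0.001 * row["turbidity"]
            + 0.0005 * row["water_temp"]
            + 0.002 * row["chl_a"])


# Synthetic rows (assumed feature names and ranges, not real monitoring data).
rng = random.Random(0)
rows = [
    {
        "turbidity": rng.uniform(5, 100),
        "water_temp": rng.uniform(5, 30),
        "chl_a": rng.uniform(1, 40),
    }
    for _ in range(200)
]

grid = [10, 30, 50, 70, 90]
pdp_turbidity = partial_dependence(toy_predict, rows, "turbidity", grid)

for value, avg in zip(grid, pdp_turbidity):
    print(f"turbidity={value:>3}  mean predicted TP={avg:.4f}")
```

Because the stand-in model is linear, the resulting curve is a straight line in turbidity; for a random forest the same procedure would trace whatever nonlinear response the model has learned.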

Keywords