Ecological Indicators (Aug 2024)

Building an XGBoost model based on landscape metrics and meteorological data for nonpoint source pollution management in the Nakdong river watershed

  • Sun Hee Shim,
  • Jung Hyun Choi

Journal volume & issue
Vol. 165
p. 112156

Abstract

Purpose: To manage river water quality effectively under the current fourth-phase Total Maximum Daily Load (TMDL) management system, we built a machine learning model that predicts whether water quality goals are achieved across the entire Nakdong river watershed in Korea.

Methods: First, to account for the effect of land use type on the runoff characteristics of pollutants, K-means clustering was used to classify the watershed into three area types: agricultural, forest, and urban. Next, we developed a machine learning model to predict whether the water quality goals for biochemical oxygen demand (BOD), total phosphorus (TP), and total organic carbon (TOC) are achieved in the different rainfall seasons. The training data were preprocessed with Isolation Forest (an outlier detection method) and ADASYN (an oversampling method for imbalanced classes). Finally, SHAP (SHapley Additive exPlanations) was used to identify the factors with the greatest effect on the achievement of water quality goals.

Results: The model’s average prediction accuracy for TP, BOD, and TOC ranged from 0.6 to 1.0. Meteorological factors, particularly monthly precipitation and average temperature, strongly influenced the model predictions for all land use types. Among the landscape metrics, edge density (ED) was highly important in all land use types. The contagion index (CONTAG) was the main factor in agricultural areas; ED, the largest patch index (LPI), CONTAG, the patch cohesion index (COHESION), and Shannon's diversity index (SHDI) were the main factors in forest areas; and patch density (PD), ED, SHDI, and COHESION were the main factors in urban areas.

Conclusion: Monthly precipitation and average temperature significantly affected whether the TMDL water quality goals were achieved in all sub-watersheds, whereas the most influential landscape metrics differed by land use type. Watershed management should therefore be tailored to land use characteristics. These results offer land use managers and landscape planners guidance for achieving water quality goals through the management of nonpoint source pollution.
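The K-means step named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. The sub-watershed names and land-use fractions below are hypothetical, the deterministic farthest-first seeding is a choice made here for reproducibility, and naming each cluster after the dominant land use of its centroid is an assumption about how the three groups might be interpreted.

```python
# Minimal K-means (Lloyd's algorithm) sketch in pure Python.
# All data below are hypothetical and are NOT taken from the study.

LAND_USES = ("agricultural", "forest", "urban")

# Hypothetical land-use fractions (agricultural, forest, urban) per sub-watershed.
subwatersheds = {
    "SW1": (0.60, 0.30, 0.10),
    "SW2": (0.55, 0.35, 0.10),
    "SW3": (0.20, 0.70, 0.10),
    "SW4": (0.15, 0.80, 0.05),
    "SW5": (0.20, 0.25, 0.55),
    "SW6": (0.25, 0.15, 0.60),
    "SW7": (0.65, 0.25, 0.10),
}


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the
    first point, then repeatedly add the point farthest from all chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid-update steps until labels stop changing."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


points = list(subwatersheds.values())
centroids, labels = kmeans(points, k=3)

# Name each cluster after the land use with the largest centroid fraction
# (an interpretive assumption of this sketch).
cluster_type = [LAND_USES[max(range(3), key=lambda i: c[i])] for c in centroids]
area_type = {name: cluster_type[lab] for name, lab in zip(subwatersheds, labels)}

for name, kind in area_type.items():
    print(name, kind)
```

In practice a library implementation with several random restarts would likely be preferred over this hand-written loop; the sketch only shows how land-use composition alone can separate sub-watersheds into three groups before separate models are fitted.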

Keywords