Watershed Management Research Journal (پژوهش‌های آبخیزداری) (Sep 2019)

Evaluating the Efficiency of K nearest Neighbor and Fuzzy C-Means Clustering Based Methods in the Outputs of Hydrological Models

  • Rozhind Delpasand,
  • Aboalhasan Fathabadi,
  • Hamed Rouhani,
  • Seyed Morteza Seyedian

DOI
https://doi.org/10.22092/wmej.2019.124804.1187
Journal volume & issue
Vol. 32, no. 3
pp. 63 – 77

Abstract


Because of incomplete model inputs and imperfect model structure, no single hydrological model performs best under all conditions or produces outputs free of uncertainty. In this situation, combining the outputs of individual models uses the strengths of each to build a new model that performs better than any single one. In this study, the efficiency of the nonparametric K-nearest-neighbor and Fuzzy C-Means clustering based methods was compared with BGA (Bates-Granger Averaging), GRA (Granger-Ramanathan Averaging), AICA (Akaike Information Criterion Averaging), BICA (Bayes Information Criterion Averaging), equal weights averaging and lasso methods for averaging the outputs of the hydrological models GR5J, SimHyd, SACRAMENTO and SMAR. First, using rainfall, evapotranspiration and temperature data, the daily discharge of the Kasilian Watershed in Pol Sefid city at the Bon Koh Station was simulated by each hydrological model. Then the different model averaging methods were used to combine the outputs of the single models. Results indicated that for the calibration period GR5J and SACRAMENTO performed best, with a correlation coefficient, Nash-Sutcliffe efficiency and RMSE of 0.83, 0.69 and 0.24, respectively. SimHyd and GR5J performed better for the validation period, with a correlation coefficient, Nash-Sutcliffe efficiency and RMSE of 0.73, 0.27 and 0.52, respectively. Among the averaging methods, lasso and GRA performed best for the calibration period, whereas equal weights averaging and BGA performed best for the validation data. For the calibration data, K-nearest-neighbor performed better than the Fuzzy C-Means clustering based method, and both methods performed best with 20 neighbors. For the validation data, the Fuzzy C-Means clustering based method performed better, and model performance was observed to improve as the number of neighbors increased.
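
To make the idea of model averaging concrete, the following is a minimal sketch of three of the weighting schemes named above (equal weights, BGA and GRA) together with the Nash-Sutcliffe efficiency and RMSE. It is not the authors' implementation. The discharge series is synthetic, the four "models" are hypothetical noisy copies of it, and all function and variable names are assumptions made for illustration. BGA is taken here as weights inversely proportional to each model's error variance, and GRA as an unconstrained least-squares fit of the observations on the model outputs.

```python
import numpy as np

def equal_weights(sims):
    """Equal weights averaging: every model gets weight 1/k."""
    n_models = sims.shape[1]
    return np.full(n_models, 1.0 / n_models)

def bates_granger_weights(sims, obs):
    """BGA (assumed form): weights proportional to 1 / error variance."""
    err_var = np.var(sims - obs[:, None], axis=0)
    inv = 1.0 / err_var
    return inv / inv.sum()

def granger_ramanathan_weights(sims, obs):
    """GRA (assumed form): ordinary least squares, no intercept, no constraints."""
    w, *_ = np.linalg.lstsq(sims, obs, rcond=None)
    return w

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def rmse(obs, sim):
    """Root mean square error."""
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

# Synthetic "observed" discharge and four hypothetical model simulations
# with increasing noise levels (illustrative data only, not from the study).
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 1.0, size=500)
noise_sd = np.array([0.3, 0.5, 0.8, 1.0])
sims = obs[:, None] + rng.normal(0.0, noise_sd, size=(500, 4))

weights = {
    "equal": equal_weights(sims),
    "BGA": bates_granger_weights(sims, obs),
    "GRA": granger_ramanathan_weights(sims, obs),
}
for name, w in weights.items():
    combined = sims @ w  # weighted combination of the model outputs
    print(name, np.round(w, 3), nash_sutcliffe(obs, combined), rmse(obs, combined))
```

Because GRA minimizes the squared error on the data it is fitted to, it cannot do worse than equal weights or any single model on that data. That is consistent with the calibration-period result reported above, but it says nothing about validation performance.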

Keywords