Water Supply (May 2021)
Machine learning method for quick identification of water quality index (WQI) based on Sentinel-2 MSI data: Ebinur Lake case study
Abstract
Surface water quality is an important factor affecting both the ecological environment and human living conditions. Remote sensing offers a practical means of monitoring surface water quality and supports water resource protection and water quality evaluation. For the Ebinur Lake Basin, located in an arid area, identifying the spectral index most sensitive to water quality is essential for surface water quality analysis and treatment. This study used Sentinel-2 MSI data at 10 m resolution to rapidly monitor water quality in the watershed. Laboratory experiments and measurement data from the Ebinur Lake Basin yielded 22 water quality parameters (WQPs), from which Z-score and redundancy analysis extracted 9 WQPs with significant contributions. Using the remote sensing spectral bands, four water indexes (NDWI, NWI, EWI, AWEI-nsh) and three two-dimensional (2D) modeling spectral indexes (DI, RI, NDI) were computed and correlated with the WQPs; overall, the 2D spectral modeling indexes correlated more strongly with the WQPs than the water indexes did. The Water Quality Index (WQI) was then calculated and modeled from the 2D spectral indexes, and predicted with four machine learning algorithms (RF, SVM, PLSR, PLSR-SVM). The results show that the 2D spectral modeling indexes invert the WQPs better than the water indexes do, with the correlation coefficient of the difference index DI (R12-R1), built from the SWIR-2 and BLUE bands, reaching 0.787. On this basis, the three 2D spectral modeling indexes were used to invert the WQI; the ratio index RI (R11/R8), built from the SWIR-1 and near-infrared (NIR) bands, performed best, with a correlation coefficient of 0.69.
In the WQI prediction, the partial least squares regression support vector machine (PLSR-SVM) model gave good modeling and prediction results (R²c = 0.873, R²v = 0.87). These results provide a reference for remote monitoring of surface water in arid areas and a basis for water quality prediction and safety evaluation.
HIGHLIGHTS
Inversion of water quality in the Ebinur Lake Basin by Sentinel-2 MSI data.
Analysis of the contribution rate of different water quality parameters by Z-score and PCA.
Comparison of the correlations of classical water indexes and spectral modeling indexes with water quality parameters.
Modeling and predicting the water quality index (WQI) using machine learning and linear correlation methods.
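The two-band index search summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The difference and ratio forms follow the expressions quoted above (R12-R1, R11/R8); the normalized difference form (Ri - Rj)/(Ri + Rj) is the conventional definition and is assumed here. The function names, the band labels used as dictionary keys, and the use of Pearson's r as the selection criterion are all illustrative assumptions.

```python
from itertools import permutations
from math import sqrt


def difference_index(ri, rj):
    """DI: difference of two band reflectances."""
    return ri - rj


def ratio_index(ri, rj):
    """RI: ratio of two band reflectances (assumes rj is non-zero)."""
    return ri / rj


def normalized_difference_index(ri, rj):
    """NDI: conventional normalized difference (assumes ri + rj is non-zero)."""
    return (ri - rj) / (ri + rj)


# Hypothetical registry of the three 2D index forms named in the abstract.
INDEX_FUNCTIONS = {
    "DI": difference_index,
    "RI": ratio_index,
    "NDI": normalized_difference_index,
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_band_pair(reflectance, target):
    """Search every ordered band pair and index form for the strongest |r|.

    reflectance: dict mapping a band label to a list of per-sample reflectances.
    target: list of per-sample WQP or WQI values, same length as each band list.
    Returns (index_name, band_i, band_j, r) for the strongest correlation found.
    """
    best = None
    for name, index_fn in INDEX_FUNCTIONS.items():
        for band_i, band_j in permutations(reflectance, 2):
            values = [
                index_fn(a, b)
                for a, b in zip(reflectance[band_i], reflectance[band_j])
            ]
            r = pearson_r(values, target)
            if best is None or abs(r) > abs(best[3]):
                best = (name, band_i, band_j, r)
    return best
```

Calling `best_band_pair` with per-sample band reflectances and a measured WQP (or the WQI) returns the index form and band pair with the largest absolute correlation, which is the kind of comparison behind the DI and RI coefficients reported above. Ordered pairs are searched because DI and RI are not symmetric in their two bands.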
Keywords