IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
Fuzzy Similarity Analysis of Effective Training Samples to Improve Machine Learning Estimations of Water Quality Parameters Using Sentinel-2 Remote Sensing Data
Abstract
Continuous monitoring of water quality parameters (WQPs) is crucial because water quality is degrading globally, primarily as a result of climate change and population growth. Machine learning (ML) models are commonly used to retrieve WQPs, but they require a large number of training samples to capture the underlying data relationships accurately. Even with sufficient training data, discrepancies remain between predicted and in-situ WQP values. This study proposes a fuzzy similarity analysis (FSA) technique that enhances ML estimates of WQPs by using the prediction errors of effective training samples. The method was applied to retrieve turbidity (Turb) and specific conductance (SC) in Lake Houston, USA, using Sentinel-2 remote sensing data. Three ML algorithms, namely mixture density networks, support vector regression, and partial least squares regression, were tested to evaluate the method's effectiveness. The results showed that FSA significantly improved the accuracy of all three ML models' predictions, yielding up to a 9.15% reduction in mean absolute percentage error (MAPE) and a 12% increase in R² for Turb, and up to a 5.47% reduction in MAPE and a 7% increase in R² for SC. The adaptability of the proposed method to other WQPs, other satellite data, and different ML models is promising for monitoring water quality in inland waters.
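For readers unfamiliar with the two accuracy metrics quoted in the abstract, the sketch below computes the mean absolute percentage error and the coefficient of determination using their common textbook definitions. The abstract does not state the exact formulas used in the study, so these definitions, the function names `mape` and `r_squared`, and the sample values are all assumptions made for illustration only.

```python
from typing import Sequence


def mape(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (the ratio would be undefined).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical in-situ and estimated values (not data from the study).
in_situ = [10.0, 20.0, 40.0]
estimated = [11.0, 18.0, 40.0]

print(f"MAPE = {mape(in_situ, estimated):.2f}%")
print(f"R^2  = {r_squared(in_situ, estimated):.4f}")
```

Some studies report R² as the squared Pearson correlation instead; the two definitions coincide only in special cases, so the definition in use should be checked before comparing values across papers.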
Keywords