Measurement: Sensors (Jun 2024)
Machine learning approach to predict the turbidity of Saki Lake, Telangana, India, using remote sensing data
Abstract
Water quality is crucial for all life forms, yet water pollution is escalating, and monitoring water quality is essential to addressing it. This study introduces an accurate and efficient approach to predicting water turbidity using linear regression, k-NN regression, and decision tree regression. The models are trained on independent features such as red band reflectance and the Normalized Difference Turbidity Index (NDTI). Hyperparameter tuning, using grid search with repeated k-fold cross-validation, is applied to improve model accuracy. The models were evaluated against turbidity measurements collected at Saki Lake in Hyderabad, India, over four years (2014–2017) by the Telangana State Groundwater Department, with Landsat-8 imagery from the USGS for the same period supplying the remote sensing data. The decision tree regression with tuned hyperparameters outperformed the other models, yielding an MAE of 3.246, an RMSE of 3.802, and a coefficient of determination (R²) of 0.776. These results indicate that the decision tree method predicts water turbidity accurately and agrees closely with the on-site measurements.
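The abstract names NDTI as a predictor and MAE, RMSE, and R² as evaluation metrics without stating their formulas. The sketch below illustrates one way to compute them, assuming the commonly used definition NDTI = (red - green) / (red + green) on surface reflectance values; the abstract does not confirm which formulation or bands the authors used. All function names are hypothetical and the code is not the authors' implementation.

```python
import math


def ndti(red, green):
    """NDTI for one pixel, assuming the (red - green) / (red + green) definition."""
    denom = red + green
    if denom == 0:
        # Assumption: return 0.0 for an all-zero pixel to avoid division by zero.
        return 0.0
    return (red - green) / denom


def mae(observed, predicted):
    """Mean absolute error between measured and predicted turbidity."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error between measured and predicted turbidity."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In a workflow like the one described, `ndti` would be applied to the red and green band reflectances at each sampling location, and the three metric functions would compare the in-situ turbidity values with each model's predictions on held-out data.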