Ecological Informatics (Nov 2024)
A new approach to estimate total nitrogen concentration in a seasonal lake based on multi-source data methodology
Abstract
Nitrogen, a key limiting nutrient in lake eutrophication, poses serious threats to both human health and ecological balance. Although total nitrogen is not optically active, this study introduces a retrieval approach for it that combines multi-source data with machine learning algorithms to improve estimation accuracy. The method integrates environmental variables, such as water temperature, depth, and flow rate, with spectral reflectance, which improves both the predictive accuracy and the stability of the machine learning models. The models tested were Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGB), Multilayer Perceptron (MLP), and Convolutional Neural Network (CNN). XGB outperformed the others, achieving an R² of 0.78, a Mean Absolute Error (MAE) of 0.21 mg/L, and a Mean Absolute Percentage Error (MAPE) of 16.04%. Applying the optimized XGB model, we documented variations in total nitrogen concentration within Poyang Lake across the hydrological phases of 2021. Concentrations were lowest during the flood season and highest during low-water periods, with high concentrations at the inlets of the North Branch of the Ganjiang River and the Raohe River estuaries. Monte Carlo simulations show that the model has low sensitivity to errors in the input features, supporting its stability. The proposed approach may enable more precise total nitrogen retrieval in similar lake waters.
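The three reported metrics and the Monte Carlo sensitivity check can be illustrated with a minimal sketch in plain Python. This is not the study's implementation. The function names, the multiplicative Gaussian noise model, the 5% default noise level, and the toy linear `predict` function standing in for the trained XGB model are all assumptions made for illustration.

```python
import random


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (e.g. mg/L)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def monte_carlo_sensitivity(predict, rows, y_true, rel_noise=0.05, n_trials=200, seed=0):
    """Perturb every input feature with multiplicative Gaussian noise
    (an assumed error model) and return the mean and standard deviation
    of the MAE over all trials."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        noisy = [[x * (1.0 + rng.gauss(0.0, rel_noise)) for x in row] for row in rows]
        scores.append(mae(y_true, predict(noisy)))
    mean_score = sum(scores) / len(scores)
    variance = sum((s - mean_score) ** 2 for s in scores) / len(scores)
    return mean_score, variance ** 0.5


def toy_predict(rows):
    """Hypothetical stand-in for a trained model: a fixed linear function
    of two features (e.g. a reflectance value and water temperature)."""
    return [0.5 * r[0] + 0.1 * r[1] for r in rows]


if __name__ == "__main__":
    rows = [[1.0, 20.0], [2.0, 22.0], [3.0, 25.0], [4.0, 18.0]]
    y_true = [2.4, 3.3, 4.1, 3.7]
    y_pred = toy_predict(rows)
    print("R2:", r2(y_true, y_pred))
    print("MAE:", mae(y_true, y_pred))
    print("MAPE (%):", mape(y_true, y_pred))
    print("MAE under 5% input noise (mean, std):",
          monte_carlo_sensitivity(toy_predict, rows, y_true))
```

In a check of this kind, a model counts as insensitive to input errors when the mean error under perturbation stays close to the unperturbed error and its spread across trials is small.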