Water Quality Research Journal (Aug 2022)

Application of machine learning approaches in predicting estuarine dissolved oxygen (DO) under a limited data environment

  • Mohammad Abu Zafer Siddik

DOI
https://doi.org/10.2166/wqrj.2022.002
Journal volume & issue
Vol. 57, no. 3
pp. 140 – 151

Abstract

The application of machine learning (ML) approaches to predict estuarine dissolved oxygen (DO) from a set of environmental covariates that includes nutrients remains unexplored owing to the unavailability of nutrient data. Using data from 12 water quality stations in southwest coastal Florida, this study examined the applicability of four ML models – support vector machine (SVM), random forest (RF), decision tree, and Wang–Mendel – in predicting DO under a limited nutrient data environment. Monthly water temperature, pH, salinity, total nitrogen (TN), and total phosphorus (TP) data were used for model development. A multiple linear regression model was trained as a benchmark against which to compare the performance of the ML models. The site-specific RF and SVM models showed superior model efficiency (Nash–Sutcliffe Efficiency > 0.80) when all the predictor variables were used for model development. However, models trained without nutrients showed reduced prediction accuracy. Modeling that pooled the data from all sites under TN-limited, TP-limited, and TN- & TP-co-limited regimes favored RF. Overall, the study yielded two key conclusions that could complement the existing approaches to estimate total daily loads for environmental management: (1) nutrients are a necessary predictor of estuarine DO dynamics and (2) RF outperforms the other ML methods under a limited data environment.

HIGHLIGHTS

  • Machine learning is applied to predicting dissolved oxygen (DO).
  • DO is predicted with nutrients included as a driver under a limited data environment.
  • Including nutrients clearly captures the dynamics of estuarine DO.
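The abstract reports model efficiency as Nash–Sutcliffe Efficiency (NSE), with values above 0.80 for the best site-specific models. As a minimal sketch of how that metric is computed, the snippet below uses the standard definition, NSE = 1 − Σ(obs − pred)² / Σ(obs − mean(obs))². The function name and the monthly DO values are hypothetical and invented for illustration; they are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """Return NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_observed = sum(observed) / len(observed)
    # Squared error of the model predictions
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared error of always predicting the observed mean
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("NSE is undefined when the observations have zero variance")

    return 1.0 - residual_ss / total_ss


# Hypothetical monthly DO values in mg/L (illustrative only, not study data)
observed_do = [6.0, 5.0, 4.0, 5.0, 7.0, 6.0]
predicted_do = [5.5, 5.0, 4.5, 5.0, 6.5, 6.0]

nse = nash_sutcliffe_efficiency(observed_do, predicted_do)
print(f"NSE = {nse:.3f}")
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model is no more accurate than predicting the observed mean for every month; negative values indicate a model that does worse than the mean.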

Keywords