Data-driven modeling techniques for prediction of settled water turbidity in drinking water treatment

Sean McKelvey; Amirhassan Abassi; C. Nataraj; Metin Duran

doi:10.3389/fenve.2024.1401180

Frontiers in Environmental Engineering (May 2024)

Affiliations

Sean McKelvey: Philadelphia Water Department, Planning and Research Unit, Philadelphia, PA, United States
Amirhassan Abassi: Villanova Center for Analytics of Dynamic Systems (VCADS), Department of Mechanical Engineering, Villanova University, Villanova, PA, United States
C. Nataraj: Villanova Center for Analytics of Dynamic Systems (VCADS), Department of Mechanical Engineering, Villanova University, Villanova, PA, United States
Metin Duran: Villanova Civil and Environmental Engineering Department, Villanova University, Villanova, PA, United States

Journal volume & issue: Vol. 3

Abstract

Drinking water treatment is a complex system of chemical, physical, and biological processes that is highly dependent on water quality and the design of the treatment process. To create decision-support tools, the prediction of key performance indicators, such as settled water turbidity, is needed. A variety of data-driven modeling techniques is available to formulate such predictions. Data-driven models are valuable where mechanistic models are lacking or the mechanisms are not fully understood, as in surface water treatment. The objective of this paper is to evaluate and compare the effectiveness of various data-driven techniques for this important, but difficult, problem. Recognizing that the size and quality of the dataset are most critical in this kind of analysis, this work uses one of the largest datasets applied in this context, consisting of 2,527 vectors of water quality and operational data (a 2,527 × 9 data frame) from a full-scale water treatment plant. The paper constructs and compares the performance of several data-driven models, including k-nearest neighbor (KNN) regression, polynomial regression, and artificial neural networks (ANN). Based on scaled root mean square error (RMSE) on the test set, the ANN model was the most predictive (0.124). Similarly, the ANN model had the best predictive performance based on total scaled RMSE (0.086). These results show that ANNs have a high potential for the development of a future decision support system for selecting appropriate coagulant doses based on settled water turbidity.
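
To make the comparison metric concrete, the following is a minimal sketch of how a scaled RMSE could be computed for a KNN regression on a held-out test set. The abstract does not say how the RMSE was scaled, so min-max scaling of the target to [0, 1] is an assumption here. The function names (min_max_scale, rmse, knn_predict) are hypothetical, and the data are invented toy numbers, not the plant dataset.

```python
import math


def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] (assumed scaling; not confirmed by the paper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Invented toy data: each row is (raw-water turbidity, coagulant dose),
# already scaled to [0, 1]. These are NOT values from the paper.
X = [(0.10, 0.20), (0.15, 0.25), (0.40, 0.50), (0.45, 0.55), (0.80, 0.90), (0.85, 0.95)]
settled_turbidity = [0.6, 0.7, 1.1, 1.2, 2.0, 2.1]  # invented values

# Scale the target, then split into alternating train and test rows.
y = min_max_scale(settled_turbidity)
train_X, train_y = X[::2], y[::2]    # rows 0, 2, 4
test_X, test_y = X[1::2], y[1::2]    # rows 1, 3, 5

test_pred = [knn_predict(train_X, train_y, q, k=1) for q in test_X]
print("scaled test RMSE:", rmse(test_y, test_pred))
```

Because the target is scaled before the error is computed, the resulting RMSE is unitless and can be compared across models trained on the same data, which is how the abstract's test (0.124) and total (0.086) figures are presumably meant to be read.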

Published in Frontiers in Environmental Engineering

ISSN: 2813-5067 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General): Environmental engineering; Technology: Environmental technology. Sanitary engineering
Website: https://www.frontiersin.org/journals/environmental-engineering
