Water Supply (Dec 2022)
Estimating sewage flow rate in Jefferson County, Kentucky, using machine learning for wastewater-based epidemiology applications
Abstract
Direct measurement of the flow rate in sanitary sewer lines is not always feasible, yet flow rate is an important parameter for normalizing data used in wastewater-based epidemiology applications. The use of machine learning to estimate past wastewater influent flow rates in support of public health applications has not been studied. The aim of this study was to assess wastewater treatment plant influent flow rates in relation to weather data and to retrospectively estimate flow rates in Louisville, Kentucky (USA), from other data types using machine learning. A random forest model was trained on a range of variables, including feces-related indicators, weather data that could be associated with dilution in sewage systems, and area demographics. The developed algorithm estimated the flow rate with an accuracy of 91.7%, although it did not perform as well for short-term (one-day) high flow rates. The results suggest that variables such as precipitation (mm/day) and population size are the more important predictors for wastewater flow estimation, whereas fecal indicator concentrations (cross-assembly phage and pepper mild mottle virus) were less important. Our study challenges currently accepted opinions by showing the public health potential of artificial intelligence for estimating wastewater treatment plant flow rates in wastewater-based epidemiology applications.
HIGHLIGHTS
Machine learning to estimate wastewater influent flow rates has not been studied for wastewater-based epidemiology applications.
Five wastewater treatment plants in Louisville, KY, USA, were studied to provide training and testing data sets of measured flow.
The random forest algorithm to estimate past flow rate had a 91.7% accuracy.
Artificial intelligence has potential applications in wastewater-based epidemiology.
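To illustrate why flow rate matters for normalization, the following minimal sketch shows one common way a measured or estimated influent flow enters a wastewater-based epidemiology calculation: a target concentration is multiplied by daily flow and divided by the population served to give a per-capita daily load. This is not taken from the study itself. The function name `per_capita_load`, its parameters, and the numeric values are hypothetical and chosen only for illustration.

```python
def per_capita_load(conc_gc_per_l: float, flow_l_per_day: float, population: int) -> float:
    """Return the per-capita daily load in gene copies per person per day.

    Assumed (hypothetical) inputs:
      conc_gc_per_l  - target concentration in gene copies per litre
      flow_l_per_day - influent flow in litres per day (measured or estimated)
      population     - number of people served by the sewershed
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if conc_gc_per_l < 0 or flow_l_per_day < 0:
        raise ValueError("concentration and flow must be non-negative")
    # Load = concentration x flow, then normalized by the population served.
    return conc_gc_per_l * flow_l_per_day / population


if __name__ == "__main__":
    # Illustrative values only; they are not data from the study.
    load = per_capita_load(conc_gc_per_l=2.0e4, flow_l_per_day=5.0e7, population=100_000)
    print(f"{load:.2e} gene copies per person per day")
```

Because the flow term scales the result linearly, any error in an estimated flow rate carries over proportionally to the normalized load, which is why the reported weaker performance on one-day high-flow events is relevant to downstream use.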
Keywords