npj Clean Water (Oct 2023)

Clustering micropollutants and estimating rate constants of sorption and biodegradation using machine learning approaches

  • Seung Ji Lim,
  • Jangwon Seo,
  • Mingizem Gashaw Seid,
  • Jiho Lee,
  • Wondesen Workneh Ejerssa,
  • Doo-Hee Lee,
  • Eunhoo Jeong,
  • Sung Ho Chae,
  • Yunho Lee,
  • Moon Son,
  • Seok Won Hong

DOI
https://doi.org/10.1038/s41545-023-00282-6
Journal volume & issue
Vol. 6, no. 1
pp. 1 – 10

Abstract

Effluent from wastewater treatment plants is considered an important source of micropollutants (MPs) in aquatic environments. However, monitoring MPs in effluents is often inefficient because of the wide variety of MP types. This study therefore derived marker constituents for estimating the behavior of the MPs in each cluster using the self-organizing map (SOM), a machine learning-based clustering method. In the SOM analysis, the physicochemical properties, functional groups, and initial biotransformation rules of 29 out of 42 MPs were used, with the ultimate aim of estimating the degradation rate constants of the remaining 13 MPs. When the physicochemical properties and functional groups were considered, the SOM analysis labeled MPs with an accuracy of 0.75 under both aerobic and anoxic conditions. Based on the clustering results, 11 MPs were identified as marker constituents under each of the aerobic and anoxic conditions. Moreover, a method for estimating the rate constants of unlabeled MPs was developed using the identified markers and a random forest classifier. The proposed algorithm could estimate both sorption and biotransformation of MPs regardless of whether sorption or biotransformation was the dominant removal mechanism. The accuracy in estimating rate constants was 0.77 under both aerobic and anoxic conditions, which is notably higher than previously reported values. The proposed procedure could be extended to monitor MPs in effluents more efficiently.
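To illustrate the kind of clustering an SOM performs, the following is a minimal sketch in plain Python. It is not the authors' implementation. The feature vectors are synthetic, the two descriptors are hypothetical, and the map size, learning rate, and neighborhood width are arbitrary assumptions chosen only to keep the example small.

```python
import math
import random

# Synthetic, made-up feature vectors (NOT data from the study): each row stands
# for one compound described by two hypothetical descriptors scaled to [0, 1].
DATA = [
    [0.10, 0.15], [0.12, 0.10], [0.08, 0.12],  # first synthetic group
    [0.90, 0.85], [0.88, 0.92], [0.93, 0.90],  # second synthetic group
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: squared_distance(weights[k], x))


def train_som(data, n_nodes=2, epochs=200, lr0=0.5, sigma0=0.5, seed=0):
    """Train a one-dimensional SOM (a line of n_nodes nodes).

    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Initialise nodes from evenly spaced samples so the run is reproducible.
    weights = [list(data[k * len(data) // n_nodes]) for k in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3    # neighborhood width shrinks
        order = list(range(len(data)))
        rng.shuffle(order)                      # present samples in random order
        for idx in order:
            x = data[idx]
            bmu = best_matching_unit(weights, x)
            for k, w in enumerate(weights):
                # Gaussian neighborhood: nodes near the winner move more.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


weights = train_som(DATA)
# Each sample is assigned to the node (cluster) that best matches it.
labels = [best_matching_unit(weights, x) for x in DATA]
print(labels)
```

In a workflow like the one summarized above, the node assigned to each compound would act as its cluster label, and a separate supervised model (the abstract names a random forest classifier) could then be trained on the labeled compounds.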