Frontiers in Environmental Science (Apr 2024)

Air pollution concentration fuzzy evaluation based on evidence theory and the K-nearest neighbor algorithm

  • Bian Chao,
  • Huang Guang Qiu

DOI
https://doi.org/10.3389/fenvs.2024.1243962
Journal volume & issue
Vol. 12

Abstract

Background: Air pollution, characterized by complex spatiotemporal dynamics and inherent uncertainty, poses significant challenges in accurate air quality prediction, and current methodologies often fail to adequately address these complexities.

Objective: This study presents a novel fuzzy modeling approach for estimating air pollution concentrations.

Methods: This fuzzy evaluation method integrates an improved evidence theory with comprehensive weighting and the K-nearest neighbor (KNN) interval distance within the framework of the matter-element extension model. This involves generating the basic probability assignment (BPA) based on interval similarity, performing sequential fusion using the Dempster–Shafer evidence theory, enhancing the fusion results via comprehensive weighting, and conducting fuzzy evaluation of air pollution concentrations using the matter-element extension KNN interval distance.

Results: Our method achieved significant improvements in monitoring air pollution concentrations, incorporating spatiotemporal factors and pollutant concentrations more effectively than existing methods. Implementing sequential fusion and subjective–objective weighting reduced the error rate by 38% relative to alternative methods.

Discussion: Fusion of multi-source air pollution data via this method effectively mitigates inherent uncertainty and enhances the accuracy of the KNN method. It produces more comprehensive air pollution concentration fusion results, improving accuracy by considering spatiotemporal correlation, toxicity, and pollution levels. Compared to traditional air-quality indices, our approach achieves greater accuracy and better interpretability, making it possible to develop more effective air quality management strategies. Future research should focus on expanding the dataset to include more diverse geographical and meteorological conditions, further refining the model to integrate external factors like meteorological data and regional industrial activity, and improving computational efficiency for real-time applications.
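
The sequential fusion step named in the Methods can be illustrated with a minimal sketch of the classical Dempster rule of combination applied one source at a time. This is not the improved evidence theory or the comprehensive weighting used in the study, whose details the abstract does not give. The pollution grades ("good", "moderate", "poor"), the three monitoring sources, and all mass values are hypothetical and chosen only for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two BPAs with the classical Dempster rule.

    Each BPA is a dict mapping a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to disjoint hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}

def sequential_fusion(bpas):
    """Fuse a list of BPAs one after another, left to right."""
    fused = bpas[0]
    for bpa in bpas[1:]:
        fused = dempster_combine(fused, bpa)
    return fused

# Hypothetical frame of discernment and BPAs from three sources.
GOOD = frozenset({"good"})
MODERATE = frozenset({"moderate"})
ANY = frozenset({"good", "moderate", "poor"})  # mass left on "unknown"

source_bpas = [
    {GOOD: 0.6, MODERATE: 0.3, ANY: 0.1},
    {GOOD: 0.5, MODERATE: 0.4, ANY: 0.1},
    {GOOD: 0.7, MODERATE: 0.2, ANY: 0.1},
]

fused = sequential_fusion(source_bpas)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

Mass left on the full frame (`ANY`) represents ignorance rather than support for a specific grade, which is how evidence theory expresses the uncertainty the abstract refers to. In the study, the BPAs are generated from interval similarity rather than set by hand as here.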

Keywords