Environment and Natural Resources Journal (Nov 2024)
A Two-Stage Feature Selection Method to Enhance Prediction of Daily PM2.5 Concentration Air Pollution
Abstract
In recent decades, air pollution has negatively affected human health and the environment. One of the major contributors to air pollution is fine particulate matter (PM2.5). However, studies on daily prediction of PM2.5 remain scarce, especially those that incorporate feature selection into the model. Hence, the main objective of this research is to propose a two-stage feature selection method, combining the adjusted correlation sharing t-test (adjcorT) with a radial basis function neural network (RBFNN), to identify the important features and thereby enhance the prediction of daily PM2.5 concentrations. Secondary data comprising five years of daily air pollutant records (2018 to 2022) were obtained from the Department of Environment Malaysia (DOE). The adjcorT-RBFNN method identified NO2, PM2.5, PM10, CO, O3, wind speed, and SO2 as important features. For day-ahead prediction in Shah Alam, the model achieved accuracy, sensitivity, specificity, precision, F1 score, and AUROC values of 0.756, 0.801, 0.717, 0.717, 0.757, and 0.758, respectively. Additionally, the predictive model may serve as an instrument for an early warning system, providing local authorities with air quality information to support the formulation of air quality improvement strategies.
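As a quick consistency check on the reported metrics, the F1 score can be recomputed from the reported precision and sensitivity (recall) using the standard harmonic-mean definition. The sketch below is illustrative only: the function name is hypothetical, and it assumes the paper uses the standard F1 definition, which the abstract does not state explicitly.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Standard F1 score: harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0  # convention when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for day-ahead prediction in Shah Alam.
reported_precision = 0.717
reported_sensitivity = 0.801

f1 = f1_from_precision_recall(reported_precision, reported_sensitivity)
print(round(f1, 3))  # 0.757, matching the reported F1 score
```

Under that assumption, the recomputed value agrees with the reported F1 score of 0.757 to three decimal places.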
Keywords