International Journal of Cognitive Computing in Engineering (Jun 2020)

A spatial feature engineering algorithm for creating air pollution health datasets

  • Raja Sher Afgun Usmani,
  • Thulasyammal Ramiah Pillai,
  • Ibrahim Abaker Targio Hashem,
  • Noor Zaman Jhanjhi,
  • Anum Saeed,
  • Akibu Mahmoud Abdullahi

Journal volume & issue
Vol. 1
pp. 98 – 107

Abstract

Air pollution is a significant cause of mortality and morbidity every year. In recent years, many researchers have focused on the associations between air pollution and health. These studies use air pollution data and health data, and apply feature engineering to create and optimize the air quality and health features. To associate these datasets, the residential address, community/county/block/city, or hospital/school address is used as the association parameter. A spatial problem arises when the Air Quality Monitoring (AQM) stations are concentrated in the urban areas of a region and the residential address or another spatial parameter is used for the association. Where AQM stations operate in close proximity, their coverage areas intersect, which raises the question of how to associate each patient with the relevant AQM station. Most studies also do not take the distance between patients and AQM stations into account. In this study, we propose a spatial feature engineering algorithm with functions to find the coordinates of patients, calculate their distances to the AQM stations, and associate patient records with the nearest AQM station, thereby removing these limitations of current air pollution health datasets. The proposed algorithm is applied to a case study in Klang Valley, Malaysia. The results show that the proposed algorithm can generate air pollution health datasets efficiently, and that it also provides a radius option to exclude patients who are located far from the stations.
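The distance and association steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes patient coordinates have already been obtained (the geocoding step is omitted), it assumes the haversine great-circle formula as the distance measure, and the function names, station identifiers, and coordinates are hypothetical values chosen only for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(patient, stations, radius_km=None):
    """Return (station_id, distance_km) for the closest station.

    Returns None when there are no stations, or when the closest station
    lies beyond radius_km (the optional exclusion radius).
    """
    if not stations:
        return None
    lat, lon = patient
    best = min(
        ((sid, haversine_km(lat, lon, s_lat, s_lon))
         for sid, (s_lat, s_lon) in stations.items()),
        key=lambda pair: pair[1],
    )
    if radius_km is not None and best[1] > radius_km:
        return None
    return best


# Hypothetical stations and patient location (illustrative coordinates only).
stations = {"S1": (3.10, 101.60), "S2": (3.00, 101.45)}
patient = (3.09, 101.61)

match = nearest_station(patient, stations, radius_km=5.0)
if match is not None:
    station_id, distance = match
    print(f"patient assigned to {station_id} at {distance:.2f} km")
```

Passing `radius_km` mirrors the radius option mentioned in the abstract: a patient whose closest station is farther than the radius is left unassigned rather than linked to a distant station. For large datasets a spatial index would likely be preferable to this linear scan.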

Keywords