IEEE Access (Jan 2019)
Frequent Pattern Mining on Time and Location Aware Air Quality Data
Abstract
With the advent of big data era, enormous volumes of data are generated every second. Varied data processing algorithms and architectures have been proposed in the past to achieve better execution of data mining algorithms. One such algorithm is extracting most frequently occurring patterns from the transactional database. Dependency of transactions on time and location further makes frequent itemset mining task more complex. The present work targets to identify and extract the frequent patterns from such time and location-aware transactional data. Primarily, the spatio-temporal dependency of air quality data is leveraged to find out frequently co-occurring pollutants over several locations of Delhi, the capital city of India. Varied approaches have been proposed in the past to extract frequent patterns efficiently, but this work suggests a generalized approach that can be applied to any numeric spatio-temporal transactional data, including air quality data. Furthermore, a comprehensive description of the algorithm along with a sample running example on air quality dataset is shown in this work. A detailed experimental evaluation is carried out on the synthetically generated datasets, benchmark datasets, and real world datasets. Furthermore, a comparison with spatio-temporal apriori as well as the other state-of-the-art non-apriori-based algorithms is shown. Results suggest that the proposed algorithm outperformed the existing approaches in terms of execution time of algorithm and memory resources.
Keywords