Kuwait Journal of Science (Jan 2017)
Multi-Level Mining of Association Rules from Warehouse Schema
Abstract
The integration of data mining techniques with data warehousing is attracting growing interest because it makes it possible to extract knowledge from large data sets. However, currently available techniques place heavy emphasis on solutions in which data mining acts as a front end to the data warehouse. Very little work has been done on applying data mining techniques to the design of data warehouses. Although techniques such as data clustering have been applied to multidimensional data to enhance the knowledge discovery process, a number of issues related to multidimensional schema design remain unresolved. These include the manual selection of important facts and dimensions in high-dimensional data environments, a challenging task for human designers when data is available in large volumes with many related variables. Moreover, the interestingness measures in use are specific to transactional databases. In this research we propose a technique that selects a subset of informative dimensions and fact variables with which to start the mining process. The association rules mined from this selection are evaluated for interestingness using advanced diversity measures. We implemented the method on two real-world data sets taken from the UCI Machine Learning Repository. The experimental results show that the rules discovered from the schema we generated were more diverse and informative than the rules discovered by a typical data mining process applied to the original data with no schema imposed on it. We also compared the results with a similar approach and found a prominent improvement in importance and diversity deviation.