Emerging Science Journal (Jun 2021)

Itemset Representation and Mining the Rules for Huntington’s Dataset

  • Carynthia Kharkongor,
  • Bhabesh Nath

DOI
https://doi.org/10.28991/esj-2021-01284
Journal volume & issue
Vol. 5, no. 3
pp. 380 – 391

Abstract

Association rule mining is not restricted to market basket analysis; it is also employed in many other domains, such as health, industry, and networking. In this paper, an association mining algorithm is applied to the health management domain. It supports decision making by producing rules for the early detection of disease. By examining a patient's personal details and symptoms, association rule mining helps predict and diagnose the disease at an early stage. The dataset used in this experiment is the Huntington's Disease (HD) dataset; HD is a rare disease. The dataset needs to be stored in memory for the computation and generation of rules. With an array data structure, each stored item takes 4 bytes. Furthermore, if the dataset is very large, storing every detail in memory becomes impractical: it is not cost-effective and consumes a lot of resources. One solution is to represent the itemsets in a more compact form so that less memory is consumed. The items are represented using a set representation, which takes less time and memory than the traditional methods. The dataset is mined using the Apriori algorithm, which produces only those itemsets that are frequent, that is, those with a high probability of occurrence. The algorithm uses prior knowledge of the frequent itemsets to find larger ones. The rules are then generated from these frequent itemsets. The memory and time consumption of the set representation is compared with that of the array representation of itemsets.
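As a rough illustration of the pipeline the abstract describes (frequent itemsets via Apriori, then rules from those itemsets), the following is a minimal sketch and not the authors' implementation. It assumes Python `frozenset` objects as a stand-in for a set representation of itemsets; the function names `apriori` and `generate_rules`, the item labels `s1` to `s3`, and the thresholds are all hypothetical, and the toy transactions are not taken from the HD dataset. It does not reproduce the paper's memory or time measurements.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Itemsets are frozensets; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Support counting: a candidate is supported by each transaction
        # that contains it as a subset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets
        # were frequent at the previous level.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is already in the dictionary.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Toy transactions with hypothetical item labels (not real patient data).
toy_transactions = [
    {"s1", "s2", "s3"},
    {"s1", "s2"},
    {"s1", "s3"},
    {"s2", "s3"},
    {"s1", "s2", "s3"},
]
toy_frequent = apriori(toy_transactions, min_support=3)
toy_rules = generate_rules(toy_frequent, min_confidence=0.75)
```

On this toy input, each single item appears in four transactions and each pair in three, so all singles and pairs are frequent at a support count of 3, while the three-item set (two transactions) is not.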

Keywords