Mining frequent itemsets from streaming transaction data using genetic algorithms

Sikha Bagui; Patrick Stanley

doi:10.1186/s40537-020-00330-9

Journal of Big Data (Jul 2020)


Affiliations

Sikha Bagui: Department of Computer Science, University of West Florida
Patrick Stanley: Department of Computer Science, University of West Florida

DOI: https://doi.org/10.1186/s40537-020-00330-9
Journal volume & issue: Vol. 7, no. 1
pp. 1 – 20

Abstract


This paper presents a study of mining frequent itemsets from streaming data in the presence of concept drift. Streaming data is volatile, which makes it particularly challenging to mine. An approach using genetic algorithms is presented, and the relationships between concept drift, sliding window size, and genetic algorithm constraints are explored. Concept drift is identified by changes in the frequent itemsets. The novelty of this work lies in using frequent itemsets, within a genetic algorithm framework, to detect concept drift when mining streaming data. Formulas are presented for calculating minimum support counts in streaming data using sliding windows. Testing showed that the ratio of the window size to the number of transactions per drift was key to good performance. Good results were hard to obtain when the sliding window was too small, since normal fluctuations in the data could then appear to be concept drift. Window size must be managed in conjunction with support and confidence values in order to achieve reasonable results. This method of detecting concept drift performed well when larger window sizes were used.
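
The idea in the abstract of identifying concept drift by changes in frequent itemsets can be illustrated with a minimal Python sketch. It is not the authors' method: the genetic algorithm search is replaced by brute-force enumeration of small itemsets, consecutive non-overlapping windows stand in for a true sliding window, and the minimum support count is assumed to be ceil(min_support * window size), since the paper's formulas are not reproduced here. The function names, the Jaccard-distance test, and the max_change threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def frequent_itemsets(window, min_support, max_size=2):
    """Return the itemsets whose count in `window` meets the support threshold."""
    # Assumption: minimum support count = ceil(min_support * window size).
    min_count = ceil(min_support * len(window))
    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))
        # Brute-force enumeration of itemsets up to max_size (no genetic algorithm).
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return {itemset for itemset, n in counts.items() if n >= min_count}


def drift_detected(previous, current, max_change=0.5):
    """Flag drift when the Jaccard distance between two itemset collections exceeds max_change."""
    union = previous | current
    if not union:
        return False
    distance = 1 - len(previous & current) / len(union)
    return distance > max_change


def detect_drifts(stream, window_size, min_support):
    """Compare consecutive non-overlapping windows and return the start index of each drifted window."""
    drifts = []
    previous = None
    for start in range(0, len(stream) - window_size + 1, window_size):
        current = frequent_itemsets(stream[start:start + window_size], min_support)
        if previous is not None and drift_detected(previous, current):
            drifts.append(start)
        previous = current
    return drifts


# Toy stream: the buying pattern switches from {a, b} to {c, d} at transaction 8.
stream = [["a", "b"]] * 8 + [["c", "d"]] * 8
print(detect_drifts(stream, window_size=4, min_support=0.5))  # prints [8]
```

With a very small window, the frequent itemsets of adjacent windows can differ through ordinary fluctuation alone, which is the false-drift problem the abstract describes; a larger window_size relative to the number of transactions per drift makes the comparison more stable.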

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
