Electronics Letters (Dec 2022)
More accurate cardinality estimation in data streams
Abstract
Many sketches based on estimator sharing have been proposed to estimate flow cardinality in massive data streams. However, existing sketches suffer from large estimation errors because they allocate the same amount of memory to every estimator and ignore the skewed cardinality distribution. Here, a filtering method called SuperFilter is proposed to enhance existing sketches. SuperFilter identifies high-cardinality flows in the data stream and records them with large estimators, while the remaining low-cardinality flows are recorded in a traditional sketch with small estimators. Experimental results show that SuperFilter reduces the average absolute error of cardinality estimation by over 81% compared with existing approaches.
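The two-tier idea described in the abstract can be illustrated with a minimal Python sketch. It is not the SuperFilter algorithm itself: the abstract does not specify how high-cardinality flows are identified or how the estimators are laid out, so everything below is an assumption made for illustration. In particular, the class names `Bitmap` and `TwoTierCounter`, the linear-counting bitmap estimator, the sizes `m_small` and `m_large`, the promotion threshold `promote_at`, and the use of one small bitmap per flow (rather than a shared sketch) are all hypothetical choices.

```python
import hashlib
import math


def _position(flow: str, element: str, m: int) -> int:
    """Deterministically map a (flow, element) pair to a bit index in [0, m)."""
    digest = hashlib.blake2b(f"{flow}|{element}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % m


class Bitmap:
    """Linear-counting cardinality estimator over m bits (assumed estimator type)."""

    def __init__(self, m: int):
        self.m = m
        self.bits = [False] * m
        self.zeros = m  # number of bits still unset

    def add(self, flow: str, element: str) -> None:
        i = _position(flow, element, self.m)
        if not self.bits[i]:
            self.bits[i] = True
            self.zeros -= 1

    def estimate(self) -> float:
        if self.zeros == 0:
            return self.m * math.log(self.m)  # saturated: conventional upper bound
        return -self.m * math.log(self.zeros / self.m)


class TwoTierCounter:
    """Hypothetical two-tier counter: small estimators by default, large after promotion."""

    def __init__(self, m_small: int = 64, m_large: int = 4096, promote_at: float = 64.0):
        self.m_small = m_small
        self.m_large = m_large
        self.promote_at = promote_at  # assumed promotion rule, not from the paper
        self.small = {}  # flow -> Bitmap
        self.large = {}  # flow -> (estimate at promotion, Bitmap)

    def record(self, flow: str, element: str) -> None:
        if flow in self.large:
            self.large[flow][1].add(flow, element)
            return
        if flow not in self.small:
            self.small[flow] = Bitmap(self.m_small)
        bitmap = self.small[flow]
        bitmap.add(flow, element)
        est = bitmap.estimate()
        if est >= self.promote_at:
            # Promote: keep the small estimate as a base and continue in a large bitmap.
            # Simplification: elements seen before promotion that recur afterwards
            # are counted again in the large bitmap.
            del self.small[flow]
            self.large[flow] = (est, Bitmap(self.m_large))

    def estimate(self, flow: str) -> float:
        if flow in self.large:
            base, bitmap = self.large[flow]
            return base + bitmap.estimate()
        bitmap = self.small.get(flow)
        return bitmap.estimate() if bitmap else 0.0
```

Under these assumptions, a flow with a handful of distinct elements stays in its 64-bit estimator, while a flow with thousands of distinct elements is promoted to a 4096-bit estimator before the small one saturates, which is the kind of memory skew the abstract argues for.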
Keywords