Algorithms (Sep 2011)
Approximating Frequent Items in Asynchronous Data Stream over a Sliding Window
Abstract
In an asynchronous data stream, the data items may be out of order with respect to their original timestamps. This paper studies the space complexity required by a data structure to maintain such a data stream so that it can approximate the set of frequent items over a sliding time window with sufficient accuracy. Prior to our work, the best solution is given by Cormode et al. [1], who gave an O (1/ε log W log (εB/ log W) min {log W, 1/ε} log |U|)- space data structure that can approximate the frequent items within an ε error bound, where W and B are parameters of the sliding window, and U is the set of all possible item names. We gave a more space-efficient data structure that only requires O (1/ε log W log (εB/ logW) log log W) space.
Keywords