IEEE Access (Jan 2018)

A Cloud-Based Parallel Space-Saving Algorithm for Big Networking Data

  • Dazhong He,
  • Yang Yang,
  • Jun Liu

DOI
https://doi.org/10.1109/ACCESS.2018.2865745
Journal volume & issue
Vol. 6
pp. 45886 – 45898

Abstract

As networks continue to evolve, analyzing traffic in its entirety requires immense resources. When processing massive data streams, the k most significant items (Top-k) are usually of greatest interest, and streaming algorithms are deployed because both memory and per-item processing time are limited. Space-Saving is one of the most popular algorithms for computing frequent and Top-k elements in data streams. In this paper, the algorithm is implemented in the cloud for analyzing big networking data, and an empirical formula for the number of counters is derived for efficiently maintaining the Top-k items. In addition, an easily understandable proof of the mergeability of the Space-Saving algorithm is presented, and experiments are conducted to confirm the effectiveness of the algorithm.
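
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of Space-Saving and of one commonly used rule for merging two summaries, which is what makes a parallel deployment possible. It is illustrative only and is not taken from the paper: the class name `SpaceSaving`, the function `merge`, the parameter `m` (number of counters), and the particular merge rule (sum matching counters, substitute the other summary's minimum for a missing item, keep the `m` largest) are all assumptions, and the paper's cloud implementation and counter-number formula are not reproduced here.

```python
class SpaceSaving:
    """Hypothetical minimal Space-Saving summary with m counters."""

    def __init__(self, m):
        self.m = m          # maximum number of monitored items (assumed m >= 1)
        self.counts = {}    # item -> estimated count (never an underestimate)

    def update(self, item, weight=1):
        if item in self.counts:
            # Monitored item: just increment its counter.
            self.counts[item] += weight
        elif len(self.counts) < self.m:
            # Free counter available: start monitoring the item.
            self.counts[item] = weight
        else:
            # Summary full: replace the item with the smallest counter and
            # inherit that counter's value, so the new item is overestimated.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[item] = floor + weight

    def top(self, k):
        # Largest estimated counts first; ties broken by item for determinism.
        return sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def merge(s1, s2):
    """Merge two summaries (assumed rule, not necessarily the paper's)."""
    # An item missing from a full summary may have occurred up to that
    # summary's minimum counter; a non-full summary has exact counts.
    min1 = min(s1.counts.values()) if len(s1.counts) >= s1.m else 0
    min2 = min(s2.counts.values()) if len(s2.counts) >= s2.m else 0
    combined = {
        item: s1.counts.get(item, min1) + s2.counts.get(item, min2)
        for item in set(s1.counts) | set(s2.counts)
    }
    merged = SpaceSaving(max(s1.m, s2.m))
    keep = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:merged.m]
    merged.counts = dict(keep)
    return merged


def summarize(stream, m):
    s = SpaceSaving(m)
    for item in stream:
        s.update(item)
    return s


# Two partitions of a stream, summarized independently and then merged.
left = summarize(["a", "a", "b"], m=2)
right = summarize(["a", "c", "c", "c"], m=2)
print(merge(left, right).top(2))  # prints [('c', 4), ('a', 3)]
```

In this toy run the true counts are a=3 and c=3, so the merged estimate for c is one too high; Space-Saving estimates can overshoot but never undershoot, which is why the number of counters matters for Top-k accuracy.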

Keywords