Big Data Mining and Analytics (Mar 2018)

Online Internet Traffic Monitoring System Using Spark Streaming

  • Baojun Zhou,
  • Jie Li,
  • Xiaoyan Wang,
  • Yu Gu,
  • Li Xu,
  • Yongqiang Hu,
  • Lihua Zhu

DOI
https://doi.org/10.26599/BDMA.2018.9020005
Journal volume & issue
Vol. 1, no. 1
pp. 47 – 56

Abstract

Owing to the explosive growth of Internet traffic, network operators must be able to monitor the state of the entire network and manage their network resources efficiently. Traditional network analysis methods, which usually run on a single machine, are no longer suitable for huge volumes of traffic data because of their limited processing capacity. Big data frameworks, such as Hadoop and Spark, can handle such analysis jobs even for a large amount of network traffic. However, Hadoop and Spark are inherently designed for offline data analysis. To cope with streaming data, various stream-processing frameworks have been proposed, such as Storm, Flink, and Spark Streaming. In this study, we propose an online Internet traffic monitoring system based on Spark Streaming. The system comprises three parts, namely, the collector, the messaging system, and the stream processor. We use TCP performance monitoring as a use case to show how network monitoring can be performed with the proposed system. Experiments on a cluster in standalone mode showed that the system performs well for large-scale Internet traffic measurement and monitoring.
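
The three-part design described in the abstract (collector, messaging system, stream processor) can be illustrated with a minimal single-process sketch. This is not the authors' implementation and does not use Spark Streaming: an in-memory queue stands in for the messaging system, and fixed time windows stand in for Spark Streaming's micro-batches. The `Packet` fields, the flow key format, and the rule that a repeated sequence number counts as a retransmission are all simplifying assumptions made for illustration.

```python
from collections import defaultdict
from queue import Queue, Empty
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Packet(NamedTuple):
    """Hypothetical record emitted by the collector (fields are assumptions)."""
    ts: float   # capture time in seconds
    flow: str   # flow key, e.g. "src:port->dst:port"
    seq: int    # TCP sequence number
    size: int   # payload size in bytes


def collect(packets: Iterable[Packet], bus: Queue) -> None:
    """Collector: push captured records onto the messaging layer."""
    for p in packets:
        bus.put(p)


def drain(bus: Queue) -> Iterator[Packet]:
    """Read everything currently waiting on the messaging layer."""
    while True:
        try:
            yield bus.get_nowait()
        except Empty:
            return


def micro_batches(records: Iterable[Packet], interval: float) -> List[List[Packet]]:
    """Group records into fixed-length windows by timestamp."""
    batches: Dict[int, List[Packet]] = defaultdict(list)
    for r in records:
        batches[int(r.ts // interval)].append(r)
    return [batches[k] for k in sorted(batches)]


def tcp_stats(batch: Iterable[Packet]) -> Dict[str, Dict[str, int]]:
    """Per-flow byte count and retransmissions within one batch.

    Assumption: a sequence number seen twice in the same flow and batch
    is counted as one retransmission.
    """
    seen = defaultdict(set)
    stats = defaultdict(lambda: {"bytes": 0, "retransmissions": 0})
    for p in batch:
        entry = stats[p.flow]
        entry["bytes"] += p.size
        if p.seq in seen[p.flow]:
            entry["retransmissions"] += 1
        seen[p.flow].add(p.seq)
    return dict(stats)


# Usage: three packets on one flow, processed in 1-second batches.
bus: Queue = Queue()
collect(
    [
        Packet(0.1, "a->b", 1, 100),
        Packet(0.4, "a->b", 1, 100),   # same seq as above: counted as a retransmission
        Packet(1.2, "a->b", 101, 50),
    ],
    bus,
)
results = [tcp_stats(b) for b in micro_batches(drain(bus), interval=1.0)]
```

In a real deployment, the queue would be replaced by a distributed messaging system and the batching and aggregation would run on the stream processor; the sketch only shows how the three roles hand data to one another.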

Keywords