IEEE Access (Jan 2021)

Comparison of Machine Learning Techniques Applied to Traffic Prediction of Real Wireless Network

  • Daria Alekseeva,
  • Nikolai Stepanov,
  • Albert Veprev,
  • Alexandra Sharapova,
  • Elena Simona Lohan,
  • Aleksandr Ometov

DOI
https://doi.org/10.1109/ACCESS.2021.3129850
Journal volume & issue
Vol. 9
pp. 159495 – 159514

Abstract

Today, the volume of network traffic is growing inexorably as the number of connected devices increases. Researchers analyze traffic by identifying sophisticated dependencies, anomalies, and novel traffic patterns to improve the performance of systems as a whole. One of the fast-developing niches in this domain concerns classic and deep Machine Learning techniques, which are expected to improve network operation even in the most complex heterogeneous environments. In this work, we first outline existing applications of Machine Learning in the communications domain and then list the most significant challenges in implementing them, together with potential solutions. Finally, we compare classical methods for predicting traffic at the LTE network edge: Linear Regression, Gradient Boosting, Random Forest, Bootstrap Aggregation (Bagging), Huber Regression, Bayesian Regression, and Support Vector Machines (SVM). We develop the corresponding Machine Learning environment based on a public cellular traffic dataset and present a comparison table of the quality metrics and execution time for each model. In our analysis, SVM trained much faster than the other algorithms. Gradient Boosting showed the best quality of predictions as it has the most efficient data determination. Random Forest showed the worst result since it depends on the number of features, which may be limited. The probabilistic Bayesian Regression method showed slightly worse results than Gradient Boosting, but its training time was shorter. The performance evaluation also demonstrated good results for linear models with the Huber loss function, which optimizes the model parameters better. As a standalone contribution, we offer the source code of the analyzed algorithms in Open Access.
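
The kind of comparison described in the abstract (fit several regressors, then tabulate a quality metric and the training time for each) can be illustrated with a minimal sketch. It is not the authors' released code and does not use their dataset. The data is a synthetic, hypothetical traffic trend with a few injected outliers, and the two models (ordinary linear regression and a Huber regression fitted by iteratively reweighted least squares) are hand-rolled in plain Python as assumptions for illustration. All function and variable names are invented here.

```python
import time


def fit_weighted_line(xs, ys, ws):
    """Closed-form weighted least squares for y = a*x + b."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def fit_linear(xs, ys):
    """Ordinary least squares: every sample has weight 1."""
    return fit_weighted_line(xs, ys, [1.0] * len(xs))


def fit_huber(xs, ys, delta=1.0, iters=100):
    """Huber regression via iteratively reweighted least squares.

    Residuals larger than delta are down-weighted by delta/|r|, so
    outliers pull on the fit linearly instead of quadratically.
    """
    a, b = fit_linear(xs, ys)
    for _ in range(iters):
        ws = []
        for x, y in zip(xs, ys):
            r = abs(y - (a * x + b))
            ws.append(1.0 if r <= delta else delta / r)
        a, b = fit_weighted_line(xs, ys, ws)
    return a, b


def mean_abs_error(xs, ys, a, b):
    return sum(abs(y - (a * x + b)) for x, y in zip(xs, ys)) / len(xs)


# Synthetic (hypothetical) training data: trend y = 2x + 1 with small
# deterministic noise and four injected outliers at the end.
train_x = [float(i) for i in range(40)]
train_y = [2.0 * x + 1.0 + ((i * 7) % 5 - 2) * 0.1
           for i, x in enumerate(train_x)]
for i in (36, 37, 38, 39):
    train_y[i] += 80.0

# Clean hold-out points that follow the underlying trend.
test_x = [float(i) for i in range(40, 50)]
test_y = [2.0 * x + 1.0 for x in test_x]


def compare_models():
    """Fit each model, recording hold-out MAE and wall-clock fit time."""
    results = {}
    for name, fit in (("linear", fit_linear), ("huber", fit_huber)):
        start = time.perf_counter()
        a, b = fit(train_x, train_y)
        elapsed = time.perf_counter() - start
        results[name] = {
            "slope": a,
            "intercept": b,
            "mae": mean_abs_error(test_x, test_y, a, b),
            "train_seconds": elapsed,
        }
    return results


results = compare_models()
for name, row in results.items():
    print(f"{name:>6}  slope={row['slope']:.3f}  "
          f"MAE={row['mae']:.3f}  time={row['train_seconds']:.6f}s")
```

On this toy data the outliers drag the ordinary least-squares slope away from the underlying trend, while the Huber fit stays close to it at the cost of extra iterations. A study of the kind summarized above would instead use a real traffic dataset and library implementations of all seven methods.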

Keywords