IEEE Access (Jan 2024)

Machine Learning-Based Cellular Traffic Prediction Using Data Reduction Techniques

  • Heba Nashaat,
  • Nihal H. Mohammed,
  • Salah M. Abdel-Mageid,
  • Rawya Y. Rizk

DOI
https://doi.org/10.1109/ACCESS.2024.3392624
Journal volume & issue
Vol. 12
pp. 58927 – 58939

Abstract

Estimating and analyzing traffic patterns has become essential for managing Quality of Service (QoS) metrics when assessing internet data traffic in cellular networks. Cellular network planners frequently apply various approaches to predict network traffic. However, most existing studies focus on using the available local data to jointly build prediction models, which raises data security challenges and increases time complexity, especially with multi-dimensional datasets. Therefore, this paper proposes a framework that handles traffic prediction by exploiting the considerable potential of Machine Learning (ML) algorithms. An Adaptive Machine Learning-based Cellular Traffic Prediction (AML-CTP) framework is presented to select a suitable ML algorithm for multi-dimensional datasets. Its objective is to streamline and speed up the selection of an appropriate model for predicting network traffic load. The framework employs two density-based clustering algorithms to group similar nearby traffic into clusters, considering data similarity and convergence. Additionally, it assesses data quality and homogeneity by training models on data samples drawn from each cluster, so that the most suitable machine learning model can be determined accurately. The optimal model is selected from four supervised prediction algorithms, reducing training time and hardware complexity. Two case studies from a popular telecommunication equipment corporation in Egypt are implemented using real-life cellular traffic with multi-dimensional features. The case studies show that the framework can help reduce the computational cost of training the model and reduce the risk of overfitting. The experimental results show that selecting the best prediction model for training could save up to 85% of computational time compared to two state-of-the-art techniques while achieving an accuracy of 98.8%.
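
The cluster-then-sample-then-select idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the two clustering algorithms or the four supervised predictors, so plain DBSCAN stands in for the clustering step and two toy predictors (a mean baseline and a one-feature least-squares line) stand in for the candidate models. The function names, the `eps`, `min_pts`, and `frac` parameters, and the synthetic `(hour, load)` records are all assumptions made for illustration.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Plain DBSCAN (stand-in clustering step); label -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:      # core point: keep expanding
                queue.extend(nb)
    return labels

def fit_mean(train):
    """Hypothetical baseline candidate: always predict the mean load."""
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def fit_linear(train):
    """Hypothetical candidate: one-feature least-squares line."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def mae(model, rows):
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)

def select_model(records, eps, min_pts, frac=0.3, seed=0):
    """Sample a fraction of each cluster, then pick the candidate with lowest MAE."""
    rng = random.Random(seed)
    labels = dbscan(records, eps, min_pts)
    sample = []
    for c in sorted(set(labels) - {-1}):
        members = [r for r, lab in zip(records, labels) if lab == c]
        k = min(len(members), max(2, int(frac * len(members))))
        sample.extend(rng.sample(members, k))
    rng.shuffle(sample)
    cut = max(1, int(0.7 * len(sample)))
    train, val = sample[:cut], sample[cut:]
    candidates = {"mean": fit_mean, "linear": fit_linear}
    scores = {name: mae(fit(train), val) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Synthetic toy data (not from the paper): two separated groups of (hour, load).
records = [(float(x), 2.0 * x + 1.0) for x in list(range(20)) + list(range(100, 120))]
best, scores = select_model(records, eps=3.0, min_pts=3)
print(best, scores)
```

Because the candidates are trained only on the per-cluster samples rather than on the full dataset, the selection step touches a fraction of the records, which is the source of the training-time saving the abstract reports. The actual saving and accuracy depend on the real algorithms and data used in the paper.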

Keywords