IEEE Access (Jan 2024)
Applications of Bigdata Technologies in the Comparison of BMTD and ARIMA Models for the Prediction of Internet Congestion
Abstract
The pace of development and adoption of new technologies for big data analytics has changed dramatically over the last several decades, and the amount of data being digitally ingested and stored is expanding exponentially. These data include structured, semi-structured, and unstructured data, and come in different sizes and formats. To utilize these vast resources, the knowledge and skills needed to manage the data and convert them into information are crucial. In this paper, first, the commonly used technologies, platforms, computational tools, and techniques currently in use for ingesting, processing, storing, and analyzing big data are reviewed. Second, those technologies are used to predict internet congestion by employing the bivariate mixture transition distribution (BMTD) model, the expectation–maximization (EM) algorithm, and the autoregressive integrated moving average (ARIMA) model. BMTD models are very effective in capturing non-Gaussian and nonlinear features, such as bursts of activity and outliers, in a single unified model class. These models assume neither equally spaced observations nor independence, assumptions that are key weaknesses of some other available time series and marked point process models. Both the Weibull BMTD and the ARIMA models are very effective time series predictive models, but a comparison of their predictive performance has not yet been addressed in the statistics and machine learning literature.
Keywords