Journal of Big Data (Feb 2019)

Data mining approach for predicting the daily Internet data traffic of a smart university

  • Aderibigbe Israel Adekitan,
  • Jeremiah Abolade,
  • Olamilekan Shobayo

DOI
https://doi.org/10.1186/s40537-019-0176-5
Journal volume & issue
Vol. 6, no. 1
pp. 1–23

Abstract

Internet traffic measurement and analysis generate datasets that are indicators of usage trends, and such datasets can be used for traffic prediction via various statistical analyses. In this study, an extensive analysis was carried out on the daily Internet traffic data generated from January to December 2017 in a smart university in Nigeria. The dataset analysed contains seven key features: the month, the week, the day of the week, the daily IP traffic for the previous day, the average daily IP traffic for the two previous days, the traffic status classification (TSC) for the download traffic, and the TSC for the upload Internet traffic data. The data mining analysis was performed using four learning algorithms (Decision Tree, Tree Ensemble, Random Forest, and Naïve Bayes) on the KNIME (Konstanz Information Miner) data mining application, and using the kNN, Neural Network, Random Forest, Naïve Bayes, and CN2 Rule Inducer algorithms on the Orange platform. A comparative performance analysis of the models is presented using the confusion matrix, Cohen’s Kappa value, the accuracy of each model, the area under the ROC curve, etc. A minimum accuracy of 55.66% was observed for both the upload and the download IP data on the KNIME platform, while minimum accuracies of 57.3% and 51.4%, respectively, were observed on the Orange platform.
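
As an illustration only, the feature set and one of the evaluation metrics named in the abstract can be sketched in plain Python. This is not the authors' pipeline, which was built in KNIME and Orange. The function names `build_features` and `cohen_kappa`, the two-class "high"/"low" TSC labelling by a fixed `threshold`, and the use of ISO week numbers are all assumptions made for the example; the abstract does not state how the TSC classes or the week feature were defined.

```python
from datetime import date, timedelta


def build_features(daily_traffic, threshold):
    """Derive per-day features from (date, volume) pairs sorted by date.

    Hypothetical helper: the binary TSC rule (volume >= threshold) is an
    assumption for illustration, not the definition used in the study.
    """
    rows = []
    # The first two days have no complete two-day history, so start at index 2.
    for i in range(2, len(daily_traffic)):
        day, volume = daily_traffic[i]
        prev1 = daily_traffic[i - 1][1]  # traffic for the previous day
        prev2 = daily_traffic[i - 2][1]  # traffic two days earlier
        rows.append({
            "month": day.month,
            "week": day.isocalendar()[1],     # ISO week number (assumed)
            "day_of_week": day.isoweekday(),  # Monday = 1 ... Sunday = 7
            "prev_day": prev1,
            "prev_two_day_avg": (prev1 + prev2) / 2,
            "tsc": "high" if volume >= threshold else "low",
        })
    return rows


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    n = len(actual)
    labels = set(actual) | set(predicted)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Chance agreement from the marginal label frequencies of each list.
    expected = sum(
        (actual.count(label) / n) * (predicted.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 0.0  # degenerate case: both lists contain a single shared label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up volumes, purely to exercise the functions.
    start = date(2017, 1, 1)
    sample = [(start + timedelta(days=i), v) for i, v in enumerate([10, 20, 30, 40])]
    for row in build_features(sample, threshold=35):
        print(row)
    print(cohen_kappa(["high", "high", "low", "low"],
                      ["high", "low", "low", "low"]))
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when one traffic class dominates the year.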

Keywords