Telfor Journal (Dec 2022)
A Comparative Study of Deep Learning and Decision Tree Based Ensemble Learning Algorithms for Network Traffic Identification
Abstract
In this paper, we apply Deep Learning (DL) and decision-tree-based ensemble learning algorithms to classify network traffic by application. Various DL models for network traffic identification are presented, implemented, and compared, including 1D convolutional, stacked autoencoder, and multi-layer perceptron models, as well as combinations of these. The results of the DL models are then compared with those obtained with two popular ensemble learning models based on decision trees: Random Forest and XGBoost. To train and test the classification models, a dataset containing both encrypted and unencrypted traffic was collected in a real network under normal operating conditions and pre-processed in a way that ensures unbiased results. The classification uncertainties of the models were also quantified on the publicly available ISCX VPN-nonVPN dataset. The models are compared in terms of precision, recall, F1 score, and accuracy for different levels of complexity and training dataset sizes. The evaluation results indicate that the decision-tree-based ensemble learning algorithms outperform the DL algorithms, providing more accurate results. The performance gap narrows as dataset complexity increases.
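The four comparison metrics named above can be illustrated with a minimal sketch. It assumes per-class (one-vs-rest) precision, recall, and F1 score plus overall accuracy; the abstract does not state how the paper averages metrics across classes, and the function name `classification_metrics` and the application labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (per_class, accuracy) for multi-class labels.

    per_class maps each label to its one-vs-rest precision, recall and F1.
    This is an illustrative helper, not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_pos = tp[label] + fp[label]
        actual_pos = tp[label] + fn[label]
        precision = tp[label] / predicted_pos if predicted_pos else 0.0
        recall = tp[label] / actual_pos if actual_pos else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return per_class, accuracy


# Hypothetical application labels, for illustration only.
y_true = ["web", "web", "video", "voip", "video", "voip"]
y_pred = ["web", "video", "video", "voip", "video", "web"]
per_class, accuracy = classification_metrics(y_true, y_pred)
print(per_class, accuracy)
```

Reporting per-class values alongside accuracy matters when application classes are imbalanced, since accuracy alone can hide poor recall on rare applications.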
Keywords