IEEE Access (Jan 2023)
A Feasible and Explainable Network Traffic Classifier Utilizing DistilBERT
Abstract
As user-oriented service industries grow rapidly, a wide range of network devices deliver these services over different access paths, and network traffic is increasing explosively as a result. With rising demand for managing limited network resources, network traffic classification has gained prominence. Traditionally, fast classification was possible with a few hundred samples described by dozens of features. Deep learning models have since proliferated because, working with hundreds of thousands of features and samples, they far outperform those earlier approaches. However, even the best-performing deep learning models have two drawbacks: they consume considerable time and resources, and their decision process is difficult to explain. We address both problems. First, we used two methods to overcome resource constraints. We modified DistilBERT, a model compressed through knowledge distillation, to obtain a compact model that still delivers remarkable performance, and we reduced the feature set by using a lightweight packet representation consisting of the header and a partial payload. Consequently, by removing superfluous features, our XENTC can process four multi-attribute packets simultaneously and effectively. It achieved F1 scores of 97.0% to 98.1%, and the trained model requires 0.0093 seconds to classify a packet, which makes it a feasible solution. Second, to move toward human-understandable explainable AI (XAI), we analyzed the relationships between features by associating them with the packet structure. Once training had finished, we identified the important features of a packet by counting how often each feature ranked among the Top-5 attention values. In addition, we visualized the model's classification performance using t-SNE to enable intuitive understanding.
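The Top-5 counting step can be illustrated with a minimal Python sketch. It assumes the attention weights are already available as rows of per-position scores. The function name `count_top_k_positions`, the toy values, and the use of `k=2` on four positions (instead of Top-5 on a full packet) are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter
import heapq


def count_top_k_positions(attention_rows, k=5):
    """Count how often each position ranks among the k largest values of a row."""
    counts = Counter()
    for row in attention_rows:
        # Indices of the k largest attention values in this row.
        top = heapq.nlargest(k, range(len(row)), key=row.__getitem__)
        counts.update(top)
    return counts


# Hypothetical toy attention rows over four feature positions (not real data).
rows = [
    [0.50, 0.05, 0.30, 0.15],
    [0.40, 0.35, 0.05, 0.20],
    [0.10, 0.20, 0.45, 0.25],
]

counts = count_top_k_positions(rows, k=2)
for position, n in counts.most_common():
    print(f"position {position}: in top-2 for {n} row(s)")
```

Positions with the highest counts would then be mapped back to packet fields (header or payload bytes) to interpret which parts of the packet the model attends to most.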
Keywords