Semi-2DCAE: a semi-supervision 2D-CNN AutoEncoder model for feature representation and classification of encrypted traffic

Jun Cui; Longkun Bai; Guangxu Li; Zhigui Lin; Penggao Zeng

doi:10.7717/peerj-cs.1635

PeerJ Computer Science (Nov 2023)

Affiliations

Jun Cui: Tiangong University, School of Life Sciences, Tianjin, China
Longkun Bai: Tiangong University, School of Electronics and Information Engineering, Tianjin, China
Guangxu Li: Tiangong University, School of Electronics and Information Engineering, Tianjin, China
Zhigui Lin: Tiangong University, School of Electronics and Information Engineering, Tianjin, China
Penggao Zeng: Tiangong University, School of Life Sciences, Tianjin, China

DOI: https://doi.org/10.7717/peerj-cs.1635
Journal volume & issue: Vol. 9, p. e1635

Abstract

Traffic classification is essential in network-related areas such as network management, monitoring, and security. As the proportion of encrypted internet traffic rises, the accuracy of port-based and deep packet inspection (DPI)-based traffic classification methods has declined. Methods based on machine learning and deep learning have improved classification accuracy, but they still suffer from inadequate extraction of traffic structure features and poor feature representativeness. This article proposes a model called Semi-supervision 2-Dimensional Convolution AutoEncoder (Semi-2DCAE). The model extracts the spatial structure features of the original network traffic with a 2-dimensional convolutional neural network (2D-CNN) and uses the autoencoder structure to reduce the dimensionality of the data, so that different traffic features are represented as spectral lines in different intervals of a one-dimensional standard coordinate system, which we call FlowSpectrum. The PReLU activation function is added to the model to keep the training process stable. We use the ISCX-VPN2016 dataset to evaluate the classification performance of the FlowSpectrum model. The experimental results show that the proposed model can characterize encrypted traffic features in a one-dimensional coordinate system. It classifies Non-VPN encrypted traffic with an accuracy of up to 99.2%, about 7% better than the state-of-the-art solution, and VPN encrypted traffic with an accuracy of 98.3%, about 2% better than the state-of-the-art solution.
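
The idea of reading a class off a one-dimensional coordinate can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an encoder has already mapped each flow to a single number, and the helper names (fit_intervals, classify), the class labels, and the code values are all hypothetical. Each class is summarized by the range of its training codes, and a new flow is assigned to the class whose interval contains its code.

```python
from typing import Dict, List, Optional, Tuple


def fit_intervals(codes: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Record the [min, max] range of the 1-D codes observed for each class."""
    return {label: (min(values), max(values)) for label, values in codes.items()}


def classify(code: float, intervals: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Return the class whose interval contains `code`, or None if none does."""
    for label, (low, high) in intervals.items():
        if low <= code <= high:
            return label
    return None


# Hypothetical 1-D codes, standing in for the output of a trained encoder.
train_codes = {
    "chat": [0.10, 0.14, 0.18],
    "streaming": [0.52, 0.55, 0.61],
}
intervals = fit_intervals(train_codes)

print(classify(0.12, intervals))  # falls inside the "chat" interval
print(classify(0.90, intervals))  # outside every interval
```

A lookup like this only works when the class intervals do not overlap, which is the property the abstract attributes to the learned representation; the sketch makes no attempt to enforce or verify it.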

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
