Digital Communications and Networks (Aug 2023)
Network traffic identification in packet sampling environment
Abstract
With the rapid growth of network bandwidth, traffic identification has become an important challenge for network management and security. In recent years, packet sampling has been widely adopted in network management systems. In this paper, sampled NetFlow data is applied to traffic identification, and the impact of packet sampling on identification accuracy is studied. The study covers feature selection, a metric correlation analysis of application behavior, and a traffic identification algorithm. Theoretical analysis and experimental results show that behavior characteristics become less significant in a packet sampling environment, and that the correlation analysis exhibits different trends for different features. However, as long as the number of flows meets the statistical requirement, feature selection and the degree of correlation are independent of the sampling ratio. At a high sampling ratio, where less effective information is available, the identification accuracy is much lower than with unsampled packets. Finally, to improve identification accuracy, we propose a Deep Belief Networks Application Identification (DBNAI) method, which achieves better classification performance than other state-of-the-art methods.
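To illustrate why sampling reduces the information available for identification, the following minimal sketch applies systematic 1-in-N sampling to a toy packet trace and rescales the per-flow packet counts. The sampling scheme, the function names, and the trace itself are assumptions made for illustration; the abstract does not specify the sampling setup or the features used.

```python
from collections import Counter


def sample_packets(packets, n):
    """Systematic 1-in-n sampling (assumed scheme): keep packets 0, n, 2n, ..."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return packets[::n]


def flow_packet_counts(packets):
    """Count packets per flow; each packet is a (flow_id, size_bytes) tuple."""
    return Counter(flow_id for flow_id, _ in packets)


def estimate_counts(sampled_counts, n):
    """Scale sampled per-flow counts by n to estimate the original counts."""
    return {flow: count * n for flow, count in sampled_counts.items()}


# Hypothetical trace: a large flow "A" (90 packets) interleaved with
# a small flow "B" (10 packets, at positions 5, 15, ..., 95).
packets = [("B", 60) if i % 10 == 5 else ("A", 1500) for i in range(100)]

original = flow_packet_counts(packets)
sampled = flow_packet_counts(sample_packets(packets, 10))
estimated = estimate_counts(sampled, 10)

print("original :", dict(original))
print("sampled  :", dict(sampled))
print("estimated:", estimated)
```

In this toy trace the small flow disappears entirely after 1-in-10 sampling, and the large flow's rescaled count is only approximate. Systematic sampling can also alias with periodic traffic, so a random sampler would likely behave differently on the same trace.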