IEEE Access (Jan 2019)
FWFS: Selecting Robust Features Towards Reliable and Stable Traffic Classifier in SDN
Abstract
Real-time Internet traffic flow classification is important for managing network resources in accordance with Quality of Service (QoS) requirements. Centralized network control in Software Defined Networking (SDN) provides a platform for Internet Service Providers (ISPs) to perform specific actions on the classified flows through routing and scheduling. Although machine learning (ML) can be an alternative to Deep Packet Inspection (DPI) for classifying SDN traffic flows, several problems, such as classifier accuracy, computational complexity, multi-class imbalanced data, and concept drift, need to be addressed in order to obtain a reliable solution. Therefore, this work proposes a hybrid filter-wrapper feature selection (FS) algorithm, named Filter-Wrapper Feature Selection (FWFS). The algorithm selects robust features that represent minority classes and are resistant to concept drift, and it keeps computation inexpensive by discarding irrelevant features before further processing by the wrapper function. In the performance evaluation, the feature selection process of FWFS takes 59.6 s and produces a classifier with an overall accuracy of 98.9%. This result is better than that of the state-of-the-art FS algorithm, Efficient Feature Optimization Approach (EFOA), which requires >400 s to select features that produce a classifier with 97.7% accuracy. In addition to the high overall accuracy, the classifier trained with features selected by FWFS has better F-measure values for each class, including minority classes; i.e., >0.8 in MULTIMEDIA and INTERACTIVE, which account for only 0.15% and 0.03%, respectively, of the total 377,526 instances in the dataset. Furthermore, the classifier is stable and reliable when classifying new data; i.e., it achieves 98.7% accuracy on new data with an F-measure of more than 0.8 in every class.
The classifier model will be embedded in the SDN-ISP traffic classification solution, which provides insights for resource allocation and traffic scheduling in the network.
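As a minimal illustration of why per-class F-measure matters alongside overall accuracy on imbalanced data, the sketch below computes both on a toy label set. It assumes the standard definition F = 2PR / (P + R), with P as precision and R as recall. The function names, the placeholder majority class OTHER, and the toy counts are hypothetical and are not taken from the paper's dataset or code.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_f_measure(y_true, y_pred):
    """Return {label: F-measure} using F = 2PR / (P + R) for each class."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


# Hypothetical toy data: 990 majority-class flows and 10 minority-class flows.
y_true = ["OTHER"] * 990 + ["INTERACTIVE"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["OTHER"] * 1000

overall = accuracy(y_true, y_pred)
f_scores = per_class_f_measure(y_true, y_pred)
print(f"accuracy = {overall:.3f}")
for label, score in f_scores.items():
    print(f"F-measure[{label}] = {score:.3f}")
```

In this toy case the overall accuracy is high even though the minority class is never detected, so its F-measure is zero. This is the kind of failure that per-class F-measure exposes and overall accuracy alone would hide.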
Keywords