IEEE Access (Jan 2020)

Packet Length Spectral Analysis for IoT Flow Classification Using Ensemble Learning

  • Gennaro Cirillo
  • Roberto Passerone

DOI
https://doi.org/10.1109/ACCESS.2020.3012203
Journal volume & issue
Vol. 8
pp. 138616 – 138641

Abstract

With the proliferation of ubiquitous and autonomous devices for sensing, control, monitoring and conditioning, the Internet of Things (IoT) holds great potential for the development of innovative applications. At the same time, network operators must support these devices with differentiated services, which rely on the ability to automatically recognize and classify the nature of the communication flows. In this paper, we present a supervised learning approach to discriminate between IoT and non-IoT traffic, and to determine the class of the device originating the packet flows. We make use of a reduced set of features based on the spectral analysis of the packet lengths of a flow, and evaluate an ensemble learning algorithm that uses a Random Forest classifier. We first discuss the datasets and the procedure that we use to extract the features, with examples from different devices. The evaluation is performed using both 10-fold cross-validation and a split into training, validation and test sets. The latter is used for hyperparameter tuning. The results show that, for reasonably large datasets, the classifier achieves very high accuracy, as well as high precision and recall. We further improve the performance on individual devices by selectively replicating flows in the dataset to achieve a better balance. We then evaluate a real-time implementation, and propose a runtime procedure to evaluate the model confidence level and trigger a retraining phase to adapt to a changing environment. A detailed analysis of the performance shows that the algorithm can support networks up to 100 Gbps on standard computing platforms.
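
To make the idea of a spectral view of packet lengths more concrete, the following minimal Python sketch computes the normalized magnitudes of the first few discrete Fourier transform coefficients of a flow's packet-length sequence. This is an illustration only: the function name `spectral_features`, the number of bins, the normalization and the example flow are all assumptions, and the abstract does not specify the feature extraction procedure actually used in the paper. Feature vectors of this kind could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`.

```python
import cmath
import math


def spectral_features(packet_lengths, n_bins=8):
    """Return normalized DFT magnitudes of a packet-length sequence.

    Hypothetical helper for illustration; the paper's real feature set
    may be defined differently.
    """
    n = len(packet_lengths)
    if n == 0:
        return [0.0] * n_bins
    features = []
    for k in range(n_bins):
        # k-th DFT coefficient of the packet-length sequence
        coeff = sum(
            length * cmath.exp(-2j * math.pi * k * t / n)
            for t, length in enumerate(packet_lengths)
        )
        # Normalize by the flow length so flows of different sizes compare
        features.append(abs(coeff) / n)
    return features


# Made-up flow that alternates between small and large packets.
periodic_flow = [60, 1500] * 8
features = spectral_features(periodic_flow, n_bins=9)
```

For this invented flow, the energy concentrates in bin 0 (the mean packet length) and in bin 8 (the alternation between the two sizes), while the bins in between are numerically zero. Periodic behavior of this kind is the sort of regularity a spectral feature can expose.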

Keywords