Applied Sciences (Apr 2023)
Application of Machine Learning Algorithms for the Validation of a New CoAP-IoT Anomaly Detection Dataset
Abstract
With the rise in smart devices, the Internet of Things (IoT) has been established as one of the preferred emerging platforms to fulfil their need for simple interconnections. The use of specific protocols such as constrained application protocol (CoAP) has demonstrated improvements in the performance of the networks. However, power-, bandwidth-, and memory-constrained sensing devices constitute a weakness in the security of the system. One way to mitigate these security problems is through anomaly-based intrusion detection systems, which aim to estimate the behaviour of the systems based on their “normal” nature. Thus, to develop anomaly-based intrusion detection systems, it is necessary to have a suitable dataset that allows for their analysis. Due to the lack of a public dataset in the CoAP-IoT environment, this work aims to present a complete and labelled CoAP-IoT anomaly detection dataset (CIDAD) based on real-world traffic, with a sufficient trace size and diverse anomalous scenarios. The modelled data were implemented in a virtual sensor environment, including three types of anomalies in the CoAP data. The validation of the dataset was carried out using five shallow machine learning techniques: logistic regression, naive Bayes, random forest, AdaBoost, and support vector machine. Detailed analyses of the dataset, data conditioning, feature engineering, and hyperparameter tuning are presented. The evaluation metrics used in the performance comparison are accuracy, precision, recall, F1 score, and kappa score. The system achieved 99.9% accuracy for decision tree models. Random forest established itself as the best model, obtaining a 99.9% precision and F1 score, 100% recall, and a Cohen’s kappa statistic of 0.99.
Keywords