IEEE Access (Jan 2024)
Realtime Feature Engineering for Anomaly Detection in IoT Based MQTT Networks
Abstract
The MQTTset dataset has been extensively investigated for enhancing anomaly detection in IoT-based systems, with a focus on identifying Denial of Service (DoS) attacks. The research addresses a critical gap in MQTT traffic anomaly detection by proposing the incorporation of the ‘source’ attribute from PCAP files and utilizing hand-crafted feature engineering techniques. Various filtering methods, including data conversion, attribute filtering, handling missing values, and scaling, are employed. Anomalies are categorized and prioritized based on frequency of occurrence, with a specific emphasis on DoS attacks. The study compares the performance of the decision tree and its eight variant models (ID3, C4.5, Random Forest, CatBoost, LightGBM, XGBoost, CART, and Gradient Boosting) for anomaly detection in IoT-based systems. Evaluation metrics such as prediction accuracy, F1 score, and computational times (training and testing) are utilized. Hyperparameter fine-tuning techniques like grid search and random search are applied to enhance model performance, accuracy, and reduce computational costs. Results indicate that the benchmark Decision Tree model achieved 92.57% accuracy and a 92.38% F1 score with training and testing times of 2.95 seconds and 0.86 seconds, respectively. The Feature Engineering (Modified) dataset demonstrated a substantial improvement, reaching 98.56% accuracy and a 98.50% F1 score, with comparable training and testing times of 0.70 seconds and 0.02 seconds. Furthermore, the Modified Decision Tree Algorithm significantly improved accuracy to 99.27%, F1 score to 99.26%, and reduced training time to 0.73 seconds and testing time to 0.14 seconds. The research contributes valuable insights into feature engineering and guides the selection of effective approaches for anomaly detection in IoT-based systems, providing early threat warnings and enhancing overall system security and reliability.
Keywords