Applied Sciences (Jun 2023)
Improving Reliability for Detecting Anomalies in the MQTT Network by Applying Correlation Analysis for Feature Selection Using Machine Learning Techniques
Abstract
Anomaly detection (AD) has captured a significant amount of focus from the research field in recent years, with the rise of the Internet of Things (IoT) application. Anomalies, often known as outliers, are defined as the discovery of anomalous occurrences or observations that differ considerably from the mainstream of the data. The IoT which is described as a network of Internet-based digital sensors that continuously generate massive volumes of data and use to communicate with one another theMessage Queuing Telemetry Transport (MQTT) protocol. Brute-force, Denial-of-Service (DoS), Malformed, Flood, and Slowite attacks are the most common in theMQTT network. One of the significant factors in IoT AD is the time consumed to predict an attack and take preemptive measures. For instance, if an attack is detected late, the loss of attack is irreversible. This paper investigates the time to detect an attack using machine learning approaches and proposes a novel approach that applies correlation analysis to reduce the training and testing time of these algorithms. The new approach has been evaluated on Random Forest, Decision Tree, Naïve Bayes, Multi-Layer Perceptron, Artificial Neural Network, Logistic Regression, and Gradient Boost. The findings indicate that the correlation analysis is significantly beneficial in the process of feature engineering, primarily to determine the most relevant features in the MQTT dataset. This is, to the best of our knowledge, the first study on MQTTset that reduces the prediction time for DoS 0.92 (95% CI −0.378, 2.22) reduced to 0.77 (95% CI −0.414, 1.97) and for Malformed 2.92 (95% CI −2.6, 8.44) reduced to 0.49 (95% CI −0.273, 1.25).
Keywords