Journal of Big Data (Feb 2024)

Optimizing IoT intrusion detection system: feature selection versus feature extraction in machine learning

  • Jing Li,
  • Mohd Shahizan Othman,
  • Hewan Chen,
  • Lizawati Mi Yusuf

DOI
https://doi.org/10.1186/s40537-024-00892-y
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 44

Abstract

Read online

Abstract Internet of Things (IoT) devices are widely used but also vulnerable to cyberattacks that can cause security issues. To protect against this, machine learning approaches have been developed for network intrusion detection in IoT. These often use feature reduction techniques like feature selection or extraction before feeding data to models. This helps make detection efficient for real-time needs. This paper thoroughly compares feature extraction and selection for IoT network intrusion detection in machine learning-based attack classification framework. It looks at performance metrics like accuracy, f1-score, and runtime, etc. on the heterogenous IoT dataset named Network TON-IoT using binary and multiclass classification. Overall, feature extraction gives better detection performance than feature selection as the number of features is small. Moreover, extraction shows less feature reduction compared with that of selection, and is less sensitive to changes in the number of features. However, feature selection achieves less model training and inference time compared with its counterpart. Also, more space to improve the accuracy for selection than extraction when the number of features changes. This holds for both binary and multiclass classification. The study provides guidelines for selecting appropriate intrusion detection methods for particular scenarios. Before, the TON-IoT heterogeneous IoT dataset comparison and recommendations were overlooked. Overall, the research presents a thorough comparison of feature reduction techniques for machine learning-driven intrusion detection in IoT networks.

Keywords