Journal of Big Data (Nov 2021)
A hybrid machine learning method for increasing the performance of network intrusion detection systems
Abstract
Abstract The internet has grown enormously for many years. It is not just connecting computer networks but also a group of devices worldwide involving big data. The internet provides an opportunity to make various innovations for any sector, such as education, health, public facility, financial technology, and digital commerce. Despite its advantages, the internet may contain dangerous activities and cyber-attacks that may happen to anyone connected through the internet. To detect any cyber-attack intrudes on the network system, an intrusion detection system (IDS) is applied, which can identify those incoming attacks. The intrusion detection system works in two mechanisms: signature-based detection and anomaly-based detection. In anomaly-based detection, the quality of the machine learning model obtained is influenced by the data training process. The biggest challenge of machine learning methods is how to build an appropriate model to represent the dataset. This research proposes a hybrid machine learning method by combining the feature selection method, representing the supervised learning and data reduction method as the unsupervised learning to build an appropriate model. It works by selecting relevant and significant features using feature importance decision tree-based method with recursive feature elimination and detecting anomaly/outlier data using the Local Outlier Factor (LOF) method. The experimental results show that the proposed method achieves the highest accuracy in detecting R2L (i.e., 99.89%) and keeps higher for other attack types than most other research in the NSL-KDD dataset. Therefore, it has a more stable performance than the others. More challenges are experienced in the UNSW-NB15 dataset with binary classes.
Keywords