A hybrid machine learning method for increasing the performance of network intrusion detection systems

Achmad Akbar Megantara; Tohari Ahmad

doi:10.1186/s40537-021-00531-w

Journal of Big Data (Nov 2021)

A hybrid machine learning method for increasing the performance of network intrusion detection systems

Achmad Akbar Megantara,
Tohari Ahmad

Affiliations

Achmad Akbar Megantara: Department of Informatics, Institut Teknologi Sepuluh Nopember
Tohari Ahmad: Department of Informatics, Institut Teknologi Sepuluh Nopember

DOI: https://doi.org/10.1186/s40537-021-00531-w
Journal volume & issue: Vol. 8, no. 1
pp. 1 – 19

Abstract

Read online

Abstract The internet has grown enormously for many years. It is not just connecting computer networks but also a group of devices worldwide involving big data. The internet provides an opportunity to make various innovations for any sector, such as education, health, public facility, financial technology, and digital commerce. Despite its advantages, the internet may contain dangerous activities and cyber-attacks that may happen to anyone connected through the internet. To detect any cyber-attack intrudes on the network system, an intrusion detection system (IDS) is applied, which can identify those incoming attacks. The intrusion detection system works in two mechanisms: signature-based detection and anomaly-based detection. In anomaly-based detection, the quality of the machine learning model obtained is influenced by the data training process. The biggest challenge of machine learning methods is how to build an appropriate model to represent the dataset. This research proposes a hybrid machine learning method by combining the feature selection method, representing the supervised learning and data reduction method as the unsupervised learning to build an appropriate model. It works by selecting relevant and significant features using feature importance decision tree-based method with recursive feature elimination and detecting anomaly/outlier data using the Local Outlier Factor (LOF) method. The experimental results show that the proposed method achieves the highest accuracy in detecting R2L (i.e., 99.89%) and keeps higher for other attack types than most other research in the NSL-KDD dataset. Therefore, it has a more stable performance than the others. More challenges are experienced in the UNSW-NB15 dataset with binary classes.

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com

About the journal

Abstract

Keywords