A Comparative Analysis of the TDCGAN Model for Data Balancing and Intrusion Detection

Mohammad Jamoos; Antonio M. Mora; Mohammad AlKhanafseh; Ola Surakhi

doi:10.3390/signals5030032

Signals (Sep 2024)

Affiliations

Mohammad Jamoos: Department of Signal Theory, Telematics and Communications, University of Granada, 18012 Granada, Spain
Antonio M. Mora: Department of Signal Theory, Telematics and Communications, University of Granada, 18012 Granada, Spain
Mohammad AlKhanafseh: Department of Computer Science, Birzeit University, West Bank P.O. Box 14, Palestine
Ola Surakhi: Cybersecurity Department, American University of Madaba, Madaba 11821, Jordan

DOI: https://doi.org/10.3390/signals5030032
Journal volume & issue: Vol. 5, no. 3, pp. 580–596

Abstract

Due to escalating network throughput and security risks, intrusion detection systems (IDSs) have garnered significant attention within the computer science field. The majority of modern IDSs are built using deep learning techniques. Nevertheless, these IDSs still have shortcomings, one of which lies in the high imbalance of most datasets used for IDSs, where the volume of samples representing normal traffic significantly outweighs the volume representing attack traffic. This imbalance restricts the performance of deep learning classifiers on minority classes, as it can bias the classifier in favor of the majority class. To address this challenge, many solutions have been proposed in the literature. TDCGAN is an innovative Generative Adversarial Network (GAN) model, based on a model-driven approach, used to address imbalanced data in IDS datasets. This paper investigates the performance of TDCGAN by employing it to balance data across four benchmark IDS datasets: CIC-IDS2017, CSE-CIC-IDS2018, KDD-cup 99, and BOT-IOT. Next, four machine learning methods are employed to classify the data, both on the imbalanced datasets and on the balanced datasets. The results obtained from each are then compared to identify the impact of dataset imbalance on classification accuracy. The results demonstrate a notable improvement in classification accuracy for each classifier after applying the TDCGAN model for data balancing.
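
The majority-class bias described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the toy labels (95 "normal" flows, 5 "attack" flows), the helper names `per_class_recall` and `random_oversample`, and the use of random oversampling are all assumptions made for illustration. Random oversampling stands in for the balancing step only; it is not TDCGAN, which generates synthetic samples with a GAN.

```python
import random
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest.

    Hypothetical stand-in for a balancing step; this is NOT TDCGAN.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Hypothetical toy labels: 95 "normal" flows and 5 "attack" flows.
labels = ["normal"] * 95 + ["attack"] * 5
samples = list(range(len(labels)))  # placeholder feature values

# A degenerate classifier that always predicts the majority class.
majority_pred = ["normal"] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, majority_pred)) / len(labels)
recall = per_class_recall(labels, majority_pred)

# After balancing, both classes have the same number of samples.
balanced_x, balanced_y = random_oversample(samples, labels)
balanced_counts = Counter(balanced_y)

print(f"accuracy={accuracy:.2f}, recall={recall}")
print(f"class counts after balancing: {dict(balanced_counts)}")
```

In this toy setting, overall accuracy looks high even though no attack sample is detected, which is why results on imbalanced data are usually read alongside per-class metrics rather than accuracy alone.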

Published in Signals

ISSN: 2624-6120 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.mdpi.com/journal/signals
