IEEE Access (Jan 2024)

Analytical Validation and Integration of CIC-Bell-DNS-EXF-2021 Dataset on Security Information and Event Management

  • Gyana Ranjana Panigrahi,
  • Prabira Kumar Sethy,
  • Santi Kumari Behera,
  • Manoj Gupta,
  • Farhan A. Alenizi,
  • Pannee Suanpang,
  • Aziz Nanthaamornphong

DOI
https://doi.org/10.1109/ACCESS.2024.3409413
Journal volume & issue
Vol. 12
pp. 83043 – 83056

Abstract

Software vulnerabilities pose a substantial challenge for cyber security experts because, if exploited, they can jeopardize the Confidentiality, Integrity, and Availability (CIA) of any system. Data-driven and modern threat intelligence tools can enhance cyber security, bolster resilience, and foster innovation across cloud, multi-cloud, and hybrid platforms. As a result, performance evaluation and accuracy verification have become essential for Security Information and Event Management (SIEM) to prevent cyber threats. A SIEM system provides threat intelligence, reporting, and security incident management by collecting and analyzing event logs and other data sources that are specific to events and their context. We propose a two-layer hybrid strategy, based on a predefined set of characteristics, to address threat intelligence, reporting, and security incident management. We use RStudio to assess how well a hybrid intrusion detection system (HIDS) handles the CIC-Bell-DNS-EXF-2021 dataset. Furthermore, we incorporate our model into Multi-Criteria Decision Analysis Methods (MCDM) to improve their ability to identify complex DNS exfiltration attacks, pairing each method with a machine learning algorithm: RF-AHP (RA), KNN-TOPSIS (KT), GBT-VIKOR (GV), and DT-Entropy-TOPSIS (DET). We evaluate the models on accuracy, absolute error, weighted average recall, weighted average precision, kappa value, logistic loss, and root mean square deviation (RMSD). We use the Machine-Automated Model function to integrate and validate the models. GV achieves the highest accuracy, 99.52%, while KT has the lowest, 93.65%. These findings also show improved multiclass classification performance compared with previous approaches.
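
As an illustration of some of the evaluation criteria named in the abstract, the following minimal Python sketch computes accuracy, weighted average precision, weighted average recall, and Cohen's kappa for a multiclass prediction. It is not the authors' RStudio workflow. The function name, the class labels, and the ten toy predictions are hypothetical and were made up for this example; they are not results from the paper.

```python
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Accuracy, support-weighted precision/recall, and Cohen's kappa."""
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)   # support of each class
    pred_counts = Counter(y_pred)   # how often each class was predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n

    # Per-class precision and recall, averaged with weights equal to class support.
    weighted_precision = sum(
        (true_counts[c] / n) * (hits[c] / pred_counts[c] if pred_counts[c] else 0.0)
        for c in labels
    )
    weighted_recall = sum(
        (true_counts[c] / n) * (hits[c] / true_counts[c] if true_counts[c] else 0.0)
        for c in labels
    )

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "weighted_precision": weighted_precision,
        "weighted_recall": weighted_recall,
        "kappa": kappa,
    }

# Hypothetical toy labels (illustrative only, not data from the paper).
y_true = ["benign"] * 4 + ["light"] * 3 + ["heavy"] * 3
y_pred = ["benign", "benign", "benign", "light",
          "light", "light", "heavy",
          "heavy", "heavy", "heavy"]

m = classification_metrics(y_true, y_pred)
print(m)
```

With support weighting, weighted average recall is always equal to accuracy, so the two criteria differ only if another averaging scheme is used.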

Keywords