SXAD: Shapely eXplainable AI-Based Anomaly Detection Using Log Data

Kashif Alam; Kashif Kifayat; Gabriel Avelino Sampedro; Vincent Karovic; Tariq Naeem

doi:10.1109/ACCESS.2024.3425472

IEEE Access (Jan 2024)

Affiliations

Kashif Alam: Department of Computer Science, Faculty of Computing and AI, Air University, Islamabad, Pakistan
Kashif Kifayat: Department of Cyber Security, Faculty of Computing and AI, Air University, Islamabad, Pakistan
Gabriel Avelino Sampedro: School of Management and Information Technology, De La Salle-College of Saint Benilde, Manila, Philippines
Vincent Karovic: Department of Information Management and Business Systems, Faculty of Management, Comenius University Bratislava, Bratislava, Slovakia
Tariq Naeem: Department of Computer Science, Faculty of Computing and AI, Air University, Islamabad, Pakistan

DOI: https://doi.org/10.1109/ACCESS.2024.3425472
Journal volume & issue: Vol. 12
pp. 95659 – 95672

Abstract

Artificial Intelligence (AI) has made tremendous progress in anomaly detection. However, AI models operate as black boxes, which makes it difficult to explain the reasoning behind their judgments in Log Anomaly Detection (LAD). Explainable Artificial Intelligence (XAI) addresses this gap in system log analysis: it aims for a white-box model that makes Machine Learning (ML) and Deep Learning (DL) models transparent, understandable, trustworthy, and dependable. In addition, applying SHapley Additive exPlanations (SHAP) to system dynamics supports informed judgments and proactive measures that optimize system functionality and reliability. This paper therefore proposes the Shapely eXplainable Anomaly Detection (SXAD) framework, which identifies the events (features) that affect the models’ interpretability, trustworthiness, and explainability. The framework uses the Kernel SHAP approach, which is based on the principle of Shapley values, to provide an innovative approach to event selection and to identify the specific events that cause abnormal behavior. The study transforms LAD from a black-box model into a white-box one, leveraging XAI to make it transparent, interpretable, explainable, and dependable. It uses benchmark data from the Hadoop Distributed File System (HDFS), structured with the Drain parser, and employs several ML models: Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB). These models achieve accuracies of 99.99%, 99.85%, and 99.99%, respectively. Our contributions are novel because no earlier work in Log Anomaly Detection (LAD) has integrated XAI-SHAP.
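
To make the Shapley value principle mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes exact Shapley values by enumerating feature subsets, whereas Kernel SHAP approximates these values by sampling. The scoring function `toy_score`, the three event counts, and the all-zero baseline are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical anomaly scorer over counts of three log events (E1, E2, E3):
# E2 contributes additively, while E1 and E3 only matter together.
def toy_score(counts):
    e1, e2, e3 = counts
    return 0.5 * e2 + 1.0 * e1 * e3


x = [2, 4, 1]          # event counts for one hypothetical log session
baseline = [0, 0, 0]   # assumed reference session with no events
phi = shapley_values(toy_score, x, baseline)

for name, contribution in zip(["E1", "E2", "E3"], phi):
    print(f"{name}: {contribution:.3f}")
```

In this toy case the additive event E2 receives its full contribution of about 2.0, and the interaction between E1 and E3 is split evenly at about 1.0 each. The attributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that SHAP relies on.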

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
