IEEE Access (Jan 2023)
A Log Parsing Framework for ALICE O<sup>2</sup> Facilities
Abstract
The ALICE (A Large Ion Collider Experiment) detector at the European Organization for Nuclear Research (CERN) generates a substantial volume of experimental data that demands efficient online and offline processing. To enhance the stability and reliability of the ALICE computing system, this study introduces an Artificial Intelligence-based logging system designed to detect, identify, and resolve issues by analyzing the system runtime information contained in logs. Existing online log parsing methods, however, often lack full automation and generality: they rely on manually defined parameters and regular expressions, which are better suited to static logs. We propose a novel, fully automated online log parsing framework for ALICE O<sup>2</sup> (Online-Offline). To overcome these challenges, we use the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm to create ground truth, genetic programming to generate regular expressions, the Artificial Bee Colony (ABC) algorithm to optimize hyperparameters, and a log template reduction algorithm to reduce similarity among log templates. We validate the framework on five benchmark log datasets and on ALICE application logs, comparing its performance with the state-of-the-art online log parsing framework Drain. The empirical results demonstrate that our approach is automated and parses logs with high accuracy, reaching 99.89% on the ALICE application log.
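To illustrate how TF-IDF scores might be used to derive ground-truth templates, the following is a minimal sketch, not the framework's actual implementation. It assumes that each log line is treated as a document, that tokens are split on whitespace, and that a token whose TF-IDF score exceeds a threshold is a variable field to be masked with `<*>`. The function name `tfidf_templates` and the parameter `threshold` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def tfidf_templates(lines, threshold=0.1):
    """Mask tokens with a high TF-IDF score as variables (illustrative only)."""
    docs = [line.split() for line in lines]
    n_docs = len(docs)
    # Document frequency: number of log lines that contain each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))

    templates = []
    for doc in docs:
        counts = Counter(doc)
        masked = []
        for tok in doc:
            tf = counts[tok] / len(doc)             # term frequency in this line
            idf = math.log(n_docs / doc_freq[tok])  # 0 when the token is in every line
            # Assumption: rare tokens (high score) are variable fields.
            masked.append("<*>" if tf * idf > threshold else tok)
        templates.append(" ".join(masked))
    return templates


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Connection from 10.0.0.3 closed",
]
print(tfidf_templates(logs))
```

In this toy input the constant tokens appear in every line and score zero, while each IP address appears once and is masked, so all three lines collapse to the same template. A fixed threshold is a simplification; in practice such a value would need tuning.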
Keywords