Applied Sciences (May 2025)
A Real-Time Semi-Supervised Log Anomaly Detection Framework for ALICE O² Facilities
Abstract
The ALICE (A Large Ion Collider Experiment) detector at the Large Hadron Collider (LHC), operated by the European Organization for Nuclear Research (CERN), is dedicated to the study of heavy-ion collisions. Within ALICE, the application logs of the online computing systems are consolidated by a logging system known as InfoLogger, which integrates data from various sources. Shifters in the control room currently review these logs manually to identify potential anomalies, a task that requires significant expertise and is made harder by the frequent onboarding of new personnel. To address this issue, we propose a real-time semi-supervised log anomaly detection framework designed to automatically detect anomalies in ALICE operations. The framework leverages BERTopic, a topic modeling technique, to give shifters real-time insights into incoming log messages. It includes an analytical dashboard that displays the anomaly status of log messages, supporting informative monitoring. In an evaluation on the InfoLogger and BGL (BlueGene/L supercomputer) datasets, we analyze the effects of word embeddings, clustering algorithms, and HDBSCAN hyperparameters on model performance. The results demonstrate that BERTopic can improve log anomaly detection over traditional topic models, attaining F1-scores of 0.957 and 0.958 on the InfoLogger and BGL datasets, respectively, even without a preprocessing step.
Keywords