EPJ Web of Conferences (Jan 2020)
ATLAS Operational Monitoring Data Archival and Visualization
Abstract
The Information Service (IS) is an integral part of the Trigger and Data Acquisition (TDAQ) system of the ATLAS experiment at the Large Hadron Collider (LHC) at CERN. The IS allows online publication of operational monitoring data, and it is used by all sub-systems and sub-detectors of the experiment to constantly monitor their hardware and software components, including more than 25,000 applications running on more than 3,000 computers. The Persistent Back-End for the ATLAS Information System (PBEAST) service stores all raw operational monitoring data for the lifetime of the experiment and provides programming and graphical interfaces to access them, including Grafana dashboards and notebooks based on the CERN SWAN platform. During ATLAS data-taking sessions over the full LHC Run 2 period, PBEAST acquired data at an average information update rate of 200 kHz and stored 20 TB of highly compacted and compressed data per year. This paper reports how, over six years, PBEAST became an essential part of the experiment's operations. It covers the challenging requirements, the failures and successes of the various attempted implementations, the new types of monitoring data, and the results of the time-series database technology evaluations aimed at improvements for LHC Run 3.