EPJ Web of Conferences (Jan 2020)
The Belle II Raw Data Management System
Abstract
The Belle II experiment, a major upgrade of the previous e+e− asymmetric collider experiment Belle, is expected to produce tens of petabytes of data per year due to the luminosity increase from the upgraded SuperKEKB accelerator. The distributed computing system of the Belle II experiment plays a key role, storing and distributing data in a reliable way to be easily accessed and analyzed by more than 1000 collaborators. In particular, the Belle II Raw Data Management system has been developed with an aim to upload output files onto grid storage, register them into the file and metadata catalogs, and make two replicas of the full raw data set using the Belle II Distributed Data Management system. It has been implemented as an extension of DIRAC (Distributed Infrastructure with Remote Agent Control) and consists of a database, services, client and monitoring tools, and several agents that treat the data automatically. The first year of data taken with the Belle II full detector has been managed by the Belle II Raw Data Management system successfully. The design, current status, and performance are presented. Prospects for improvements towards the full luminosity data taking are also reviewed.