EPJ Web of Conferences (Jan 2024)
Updates to the ATLAS Data Carousel Project
Abstract
The High Luminosity upgrade to the LHC (HL-LHC) is expected to deliver scientific data at the multi-exabyte scale. In order to address this unprecedented data storage challenge, the ATLAS experiment launched the Data Carousel project in 2018. Data Carousel is a tape-driven workflow whereby bulk production campaigns with input data resident on tape are executed by staging and promptly processing a sliding window to disk buffer such that only a small fraction of inputs are pinned on disk at any one time. Data Carousel is now in production for ATLAS in Run3. In this paper, we provide updates on recent Data Carousel R&D projects, including data-on-demand and tape smart writing. Data-on-demand removes from disk data that has not been accessed for a predefined period, when users request them, they will be either staged from tape or recreated by following the original production steps. Tape smart writing employs intelligent algorithms for file placement on tape in order to retrieve data back more efficiently, which is our long term strategy to achieve optimal tape usage in Data Carousel.