EPJ Web of Conferences (Jan 2024)
Towards a distributed heterogeneous task scheduler for the ATLAS offline software framework
Abstract
With the increased data volumes expected to be delivered by the HL-LHC, it becomes critical for the ATLAS experiment to maximize the utilization of available computing resources, ranging from conventional GRID clusters to supercomputers and cloud computing platforms. To run its data processing applications on these resources, the ATLAS software framework must be capable of efficiently executing data processing tasks in heterogeneous distributed computing environments. Today, the Gaudi Avalanche Scheduler, whose implementation is based on Intel TBB, lets us efficiently schedule Athena algorithms across multiple threads within a single compute node. We aim to develop a new framework scheduler capable of supporting distributed heterogeneous environments, based on technologies such as HPX or Ray. After an initial evaluation of these technologies, we began developing a prototype distributed task scheduler for the Athena framework. This contribution describes the prototype scheduler and presents preliminary results of performance studies with ATLAS data processing applications.