EPJ Web of Conferences (Jan 2016)

PanDA: Exascale Federation of Resources for the ATLAS Experiment at the LHC

  • Fernando Barreiro Megino,
  • Jose Caballero Bejar,
  • Kaushik De,
  • John Hover,
  • Alexei Klimentov,
  • Tadashi Maeno,
  • Paul Nilsson,
  • Danila Oleynik,
  • Siarhei Padolski,
  • Sergey Panitkin,
  • Artem Petrosyan,
  • Torre Wenaus

DOI: https://doi.org/10.1051/epjconf/201610801001
Journal volume & issue: Vol. 108, p. 01001

Abstract


After a scheduled maintenance and upgrade period, the world’s largest and most powerful machine – the Large Hadron Collider (LHC) – is about to enter its second run at unprecedented energies. To exploit the scientific potential of the machine, the experiments at the LHC face computational challenges involving enormous data volumes that need to be analysed by thousands of physics users and compared to simulated data. Given diverse funding constraints, the computational resources for the LHC have been deployed in a worldwide mesh of data centres, connected to each other through Grid technologies. The PanDA (Production and Distributed Analysis) system was developed in 2005 for the ATLAS experiment on top of this heterogeneous infrastructure to seamlessly integrate the computational resources and give users the impression of a single, unified system. Since its origins, PanDA has evolved alongside emerging computing paradigms inside and outside HEP, such as changes in the networking model, Cloud Computing and HPC. It currently runs steadily on up to 200 thousand simultaneous cores (limited by the resources available to ATLAS), handles up to two million jobs per day and processes over an exabyte of data per year. The success of PanDA in ATLAS has triggered its widespread adoption and evaluation by other experiments. In this contribution we give an overview of the PanDA components and focus on the new features and upcoming challenges that are relevant to the next decade of distributed computing workload management using PanDA.
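The abstract describes PanDA as a single workload-management layer spread over heterogeneous Grid, Cloud and HPC resources. As a purely illustrative aid, the toy sketch below shows the underlying idea of a central job queue drained by lightweight workers ("pilots") running at independent sites, which claim work only when a local slot is free. All names used here (submit, pilot, GRID_SITE_A, ...) are hypothetical and do not correspond to the actual PanDA client or server API.

```python
# Conceptual sketch only -- NOT the real PanDA API. A central queue of job
# definitions is drained by pilots at heterogeneous sites, illustrating how a
# single system can present many resources as one.
import queue
import threading

job_queue = queue.Queue()  # stands in for the central job repository


def submit(job_id, payload):
    """User-side submission: jobs are defined centrally, not tied to a site."""
    job_queue.put({"id": job_id, "payload": payload})


def pilot(site_name):
    """Site-side pilot: pulls a job only when this site has a free slot."""
    while True:
        try:
            job = job_queue.get(timeout=1)
        except queue.Empty:
            return  # no more work; the pilot exits
        print(f"[{site_name}] running job {job['id']}: {job['payload']}")
        job_queue.task_done()


# Example: two hypothetical sites drain a shared workload.
for i in range(6):
    submit(i, f"simulate events, task {i}")

workers = [threading.Thread(target=pilot, args=(site,))
           for site in ("GRID_SITE_A", "CLOUD_SITE_B")]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

The design point this sketch tries to capture is late binding: the decision of where a job runs is deferred until a resource is actually available, which is what allows very different back-ends to be federated behind one queue.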