EPJ Web of Conferences (Jan 2020)

Migrating INFN-T1 from CREAM-CE/LSF to HTCondor-CE/HTCondor

  • Dal Pra S
  • Fornari F
  • Michelotto D
  • Chierici A

DOI
https://doi.org/10.1051/epjconf/202024503037
Journal volume & issue
Vol. 245
p. 03037

Abstract

The INFN Tier-1 datacentre provides computing resources to several HEP and Astrophysics experiments. These are organized in Virtual Organizations, which submit jobs to our computing facilities through Computing Elements (CEs) acting as Grid interfaces to the Local Resource Management System (LRMS). We are phasing out our current LRMS (IBM/Platform LSF 9.1.3) and CE set (CREAM), adopting HTCondor as a replacement for LSF and HTCondor-CE in place of CREAM. A small instance has been set up to practice cluster management and to evaluate the feasibility of our plans to migrate to a new LRMS and CE set. A second cluster instance has been set up for production work. A number of management tools have been adapted or rewritten to integrate the new system with the existing infrastructure. Two different accounting solutions for the HTCondor-CE have been implemented, and the more reliable one has been adopted. A Python tool has been written to disentangle the management of HTCondor machines from our Puppet instance and to enable quicker configuration of the cluster nodes. The monitoring tools tied to the old system are being adapted to work with the new one as well. Finally, the most relevant setup steps have been documented in a public wiki page, and a support mailing list has been created to help other INFN sites willing to migrate their LRMS and CE to HTCondor. This document reports on our experience with HTCondor-CE on top of HTCondor and the integration of this system into our infrastructure.