EPJ Web of Conferences (Jan 2020)
The SIMPLE Framework for deploying containerized grid services
Abstract
The Worldwide LHC Computing Grid (WLCG) currently comprises about 170 sites. To support WLCG workloads, each site has to deploy and maintain a number of possibly complex grid services. Site managers often require the assistance of WLCG experts, for example when new software versions need to be deployed. Modern configuration management (e.g. Puppet, Ansible), container orchestration (e.g. Docker Swarm, Kubernetes) and containerization technologies (e.g. Docker, Podman) can make such activities more lightweight by packaging sensible configurations of grid services and providing simple mechanisms to distribute and deploy them across the infrastructure available at a site. This article describes the SIMPLE project: a Solution for Installation, Management and Provisioning of Lightweight Elements. The SIMPLE framework leverages modern infrastructure management tools to deploy containerized grid services, such as popular compute elements (e.g. HTCondor, ARC), batch systems (e.g. HTCondor, Slurm) and worker nodes. Its architecture follows the principles of sustainability, scalability and extensibility. We describe how system administrators can use the framework and present the first results, featuring the migration of computing resources to HTCondor at two sites. We conclude with an outlook on further developments.