EPJ Web of Conferences (Jan 2024)

A web workbench system for the Slurm cluster at IHEP

  • Du Ran,
  • Shi Jingyan,
  • Jiang Xiaowei,
  • Guo Chaoqi

DOI
https://doi.org/10.1051/epjconf/202429501007
Journal volume & issue
Vol. 295
p. 01007

Abstract

Slurm has provided REST APIs since version 20.02. With these REST APIs, one can interact with the slurmctld and slurmdbd daemons in a RESTful way. As a result, job submission and cluster status queries can be carried out through a web system. To take advantage of the Slurm REST APIs, a web workbench system has been developed for the Slurm cluster at IHEP. The workbench system consists of four subsystems: dashboard, tomato, jasmine and cosmos. The dashboard subsystem displays cluster status, including nodes and jobs. The tomato subsystem submits special HTCondor glidein jobs to the Slurm cluster. The jasmine subsystem generates and submits batch jobs based on workload parameters. The cosmos subsystem is an accounting system, which not only generates statistical charts but also provides REST APIs to query jobs. This paper presents the design and implementation details of the Slurm workbench. With the help of the workbench, administrators and researchers can get their work done effectively.
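
To make the RESTful interaction concrete, here is a minimal sketch of how a dashboard-style client might query job status through slurmrestd. It is not taken from the paper's workbench. The API version string, the base URL and port, and the helper names `build_request`, `fetch_json` and `count_job_states` are assumptions for illustration. The `X-SLURM-USER-NAME` and `X-SLURM-USER-TOKEN` headers assume that the site has enabled Slurm's JWT authentication. The shape of the `job_state` field differs between API versions, so the sketch accepts either a string or a list.

```python
import json
from collections import Counter
from urllib.request import Request, urlopen

# Assumption: the plugin version exposed by the local slurmrestd; adjust per site.
API_VERSION = "v0.0.39"


def build_request(base_url, path, user, token):
    """Build an authenticated GET request for a slurmrestd endpoint."""
    url = f"{base_url.rstrip('/')}/slurm/{API_VERSION}/{path.lstrip('/')}"
    return Request(
        url,
        headers={
            # JWT authentication headers read by slurmrestd.
            "X-SLURM-USER-NAME": user,
            "X-SLURM-USER-TOKEN": token,
            "Accept": "application/json",
        },
    )


def fetch_json(request, timeout=10.0):
    """Send the request and decode the JSON body (requires a reachable slurmrestd)."""
    with urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def count_job_states(payload):
    """Count jobs per state from a decoded 'jobs' response."""
    counts = Counter()
    for job in payload.get("jobs", []):
        state = job.get("job_state", "UNKNOWN")
        # Some API versions report a single string, others a list of states.
        states = state if isinstance(state, list) else [state]
        counts.update(states)
    return dict(counts)
```

A possible use, given a valid token and a hypothetical endpoint, is `count_job_states(fetch_json(build_request("http://localhost:6820", "jobs", user, token)))`, which returns a mapping from job state to job count that a status page could render.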