Open Computer Science (Jul 2018)

A methodology for the professional training of the management and evaluation of HPC systems

  • Skrinarova Jarmila,
  • Dudas Adam

DOI
https://doi.org/10.1515/comp-2018-0008
Journal volume & issue
Vol. 8, no. 1
pp. 68 – 79

Abstract


The paper is motivated by the critical demand for experts and scientists who work in mathematical modeling, simulation, and big data techniques and who are familiar with the management of HPC systems from both the user's and the administrator's point of view. We created a new course entitled “HPC system management”. Our goal is to provide students with knowledge and understanding of the complex problem of HPC system management, in particular job scheduling. An important fact is that the job scheduling problem is NP-complete. A further objective of the course is to educate skilled experts who are able to design and implement programs, scripts, and models for job management that solve specific parts of this complex problem. The course is innovative from several points of view. Its content is oriented toward HPC system management, in contrast to existing courses, which usually focus on the development of HPC applications. We also developed and provide a new educational methodology in the form of a scientific project, which decomposes the complex problem into subproblems and subsequently brings the solutions to the subproblems together to form a unified model. The methodology focuses on generating a (pseudo-)optimal job schedule using data from real systems. The huge volume of data used leads to ideas and methodologies of problem solving that are suitable for problems not solvable in polynomial time. The methodology also includes the implementation of a job scheduling simulator. The paper presents a pilot course in which students explore various scheduling algorithms and investigate their properties using data obtained from NorduGrid.

Keywords