EPJ Web of Conferences (Jan 2021)

CERN Tape Archive: a distributed, reliable and scalable scheduling system

  • Cano Eric,
  • Bahyl Vladimír,
  • Caffy Cédric,
  • Cancio Germán,
  • Davis Michael,
  • Keeble Oliver,
  • Kotlyar Viktor,
  • Leduc Julien,
  • Murray Steven

DOI
https://doi.org/10.1051/epjconf/202125102037
Journal volume & issue
Vol. 251
p. 02037

Abstract

The CERN Tape Archive (CTA) provides a tape backend to disk systems and, in conjunction with EOS, manages the data of the LHC experiments at CERN. Magnetic tape storage offers the lowest cost per unit volume today, followed by hard disks and flash. In addition, current tape drives deliver solid bandwidth (typically 360 MB/s per device), but at the cost of high latencies, both for mounting a tape in the drive and for positioning when accessing non-adjacent files. As a consequence, the transfer scheduler should queue transfer requests until the queued volume is large enough to warrant a tape mount. Despite these transfer latencies, user-interactive operations should have low latency. The scheduling system for CTA was built on the experience gained with CASTOR. Its implementation ensures reliability and predictable performance, while simplifying development and deployment. As CTA is expected to be used for a long time, lock-in to vendors or technologies was minimized. Finally, quality assurance systems were put in place to validate reliability and performance while allowing fast and safe development turnaround.
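
The queueing idea in the abstract, accumulating requests per tape until the queued volume justifies a mount, can be illustrated with a minimal Python sketch. This is not CTA's implementation: the names `Request`, `MountScheduler` and `min_bytes` are hypothetical, and a single byte threshold is assumed as the only mount criterion.

```python
from dataclasses import dataclass


@dataclass
class Request:
    tape: str   # identifier of the tape holding the file (hypothetical field)
    size: int   # file size in bytes


class MountScheduler:
    """Queue requests per tape; a tape is mountable once enough bytes are queued."""

    def __init__(self, min_bytes: int):
        # Assumed policy: mount only when at least min_bytes are waiting.
        self.min_bytes = min_bytes
        self.queues = {}

    def submit(self, request: Request) -> None:
        # Requests are only queued here; no mount is triggered directly.
        self.queues.setdefault(request.tape, []).append(request)

    def queued_bytes(self, tape: str) -> int:
        return sum(r.size for r in self.queues.get(tape, []))

    def mountable(self) -> list:
        # Tapes whose queued volume has reached the threshold, in sorted order.
        return sorted(t for t in self.queues
                      if self.queued_bytes(t) >= self.min_bytes)

    def mount(self, tape: str) -> list:
        # Hand the whole queue to the drive in one mount and clear it.
        return self.queues.pop(tape, [])


scheduler = MountScheduler(min_bytes=1000)
scheduler.submit(Request("T1", 600))
first_check = scheduler.mountable()    # threshold not reached yet
scheduler.submit(Request("T1", 500))
second_check = scheduler.mountable()   # 1100 bytes queued for T1
batch = scheduler.mount("T1")
```

Batching this way amortizes the mount and positioning latency over many files. A real scheduler would also need a second criterion, such as a maximum request age, so that small queues are not left waiting indefinitely.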