EPJ Web of Conferences (Jan 2020)
Assessment of the ALICE O2 readout servers
Abstract
The ALICE experiment at the CERN LHC (Large Hadron Collider) is undertaking a major upgrade during LHC Long Shutdown 2 in 2019-2020. The raw data input from the detector will then increase a hundredfold, up to 3.4 TB/s. In order to cope with such a large throughput, a new Online-Offline computing system, called O2, will be deployed. The FLP (First Level Processor) servers are the readout nodes hosting the CRU (Common Readout Unit) cards in charge of transferring the data from the detector links to the computer memory. The data then flow through a chain of software components until they are shipped over the network to the processing nodes. In order to select a suitable platform for the FLP, it is essential that the hardware and the software are tested together. Each candidate server is therefore equipped with multiple readout cards (CRU), one InfiniBand 100G Host Channel Adapter, and the O2 readout software suite. A series of tests is then run to ensure the readout system is stable and fulfils the data throughput requirement of 42 Gb/s (the highest output data rate of an FLP equipped with three CRUs). This paper presents the software and firmware features developed to evaluate and validate different candidates for the FLP servers. In particular, we describe the data flow from the CRU firmware generating data up to the network card, where the buffers are sent over the network using RDMA. We also discuss the testing procedure and the results collected on different servers.
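To put the 42 Gb/s requirement in perspective, the short sketch below converts it to bytes per second and compares it with the nominal rate of the 100G network adapter. It uses only the figures quoted above. The function names are illustrative, the even split of traffic across the three CRUs is an assumption made for the sake of the example, and the nominal link rate ignores protocol overhead.

```python
# Figures quoted in the abstract.
FLP_OUTPUT_GBPS = 42.0     # required FLP output rate, in Gb/s
CRUS_PER_FLP = 3           # CRU cards in the most loaded FLP
LINK_NOMINAL_GBPS = 100.0  # nominal rate of the InfiniBand 100G adapter


def gbps_to_gbytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8.0


def per_cru_share_gbps(total_gbps: float, n_crus: int) -> float:
    """Average rate per CRU, ASSUMING the load is spread evenly (hypothetical)."""
    return total_gbps / n_crus


def link_utilisation(required_gbps: float, nominal_gbps: float) -> float:
    """Fraction of the nominal link rate taken by the required throughput."""
    return required_gbps / nominal_gbps


if __name__ == "__main__":
    print(f"FLP output: {gbps_to_gbytes_per_s(FLP_OUTPUT_GBPS):.2f} GB/s")
    print(f"Average per CRU (even split assumed): "
          f"{per_cru_share_gbps(FLP_OUTPUT_GBPS, CRUS_PER_FLP):.1f} Gb/s")
    print(f"Share of nominal 100G link: "
          f"{link_utilisation(FLP_OUTPUT_GBPS, LINK_NOMINAL_GBPS):.0%}")
```

Under these assumptions the requirement corresponds to 5.25 GB/s sustained into and out of the server memory, which uses less than half of the nominal network link rate.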