EPJ Web of Conferences (Jan 2024)
INFN and the evolution of distributed scientific computing in Italy
Abstract
INFN has been running a distributed infrastructure (the Tier-1 at Bologna-CNAF and 9 Tier-2 centres) for more than 20 years which currently offers about 150000 CPU cores and 120 PB of space both in tape and disk storage, serving more than 40 international scientific collaborations. This Grid-based infrastructure was augmented in 2019 with the INFN Cloud: a production quality multi-site federated Cloud infrastructure, composed by a core backbone, and which is able to integrate other INFN sites and public or private Clouds as well. The INFN Cloud provides a customizable and extensible portfolio offering computing and storage services spanning the IaaS, PaaS and SaaS layers, with dedicated solutions to serve special purposes, such as ISO-certified regions for the handling of sensitive data. INFN is now revising and expanding its infrastructure to tackle the challenges expected in the next 10 years of scientific computing adopting a “cloud-first” approach, through which all the INFN data centres will be federated via the INFN Cloud middleware and integrated with key HPC centres, such as the pre-exascale Leonardo machine at CINECA. In such a process, which involves both the infrastructures and the higher level services, initiatives and projects such as the "Italian National Centre on HPC, Big Data and Quantum Computing" (funded in the context of the Italian "National Recovery and Resilience Plan") and the Bologna Technopole are precious opportunities that will be exploited to offer advanced resources and services to universities, research institutions and industry. In this paper we describe how INFN is evolving its computing infrastructure, with the ambition to create and operate a national vendorneutral, open, scalable, and flexible "datalake" able to serve much more than just INFN users and experiments.