Department of Software Technology (ST), Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), Delft University of Technology, Delft, The Netherlands
Henk Kant
Department of Software Technology (ST), Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), Delft University of Technology, Delft, The Netherlands
Asterios Katsifodimos
Department of Software Technology (ST), Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), Delft University of Technology, Delft, The Netherlands
Machine learning (ML) practitioners and organizations are building repositories of pre-trained models, referred to as model zoos. These model zoos contain metadata describing the properties of the ML models and datasets. This metadata serves crucial roles in reporting, auditing, ensuring reproducibility, and enhancing interpretability. Despite the growing adoption of descriptive formats such as datasheets and model cards, the metadata available in existing model zoos remains notably limited. Moreover, existing formats have limited expressiveness, which constrains the use of model repositories for purposes beyond mere storage of pre-trained models. This paper proposes a unified metadata representation format for model zoos. We illustrate that comprehensive metadata enables a diverse range of applications, including search, reuse, comparison, and composition of ML models. We also detail the design and highlight the implementation of an advanced model zoo system built on top of our proposed metadata representation.