SoftwareX (May 2024)

SnapperML: A python-based framework to improve machine learning operations

  • Antonio Molner,
  • Francisco Carrillo-Perez,
  • Alberto Guillén

Journal volume & issue
Vol. 26
p. 101648

Abstract

Data science has emerged as a vital discipline applicable across numerous industry sectors. However, achieving reproducibility in this field remains a challenging and unresolved problem. Additionally, transitioning trained models from development to production environments often proves to be a non-trivial task. In this study, we propose SnapperML, a comprehensive framework designed to address these issues by enabling practitioners to establish structured workflows that facilitate result reproducibility. Leveraging DevOps techniques, SnapperML ensures seamless model deployment from the lab to production, mitigating the risks associated with compatibility issues and model selection errors. The framework enables meticulous tracking of every aspect of model training, including hyperparameter selection, tuning, and distributed training on a server. By offering a suite of tools for model tracking and optimization, SnapperML presents a promising solution to the reproducibility challenge in the field of data science.

Keywords