Unlearning regularization for Boltzmann machines

Enrico Ventura; Simona Cocco; Rémi Monasson; Francesco Zamponi

doi:10.1088/2632-2153/ad5a5f

Machine Learning: Science and Technology (Jan 2024)

Affiliations

Enrico Ventura: Dipartimento di Fisica, Sapienza Università di Roma, P.le A. Moro 2, 00185 Roma, Italy; Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, F-75005 Paris, France
Simona Cocco: Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, F-75005 Paris, France
Rémi Monasson: Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, F-75005 Paris, France
Francesco Zamponi: Dipartimento di Fisica, Sapienza Università di Roma, P.le A. Moro 2, 00185 Roma, Italy; Laboratoire de Physique de l’Ecole Normale Supérieure, ENS, Université PSL, F-75005 Paris, France

DOI: https://doi.org/10.1088/2632-2153/ad5a5f
Journal volume & issue: Vol. 5, no. 2
p. 025078

Abstract

Boltzmann machines (BMs) are graphical models with interconnected binary units, employed for the unsupervised modeling of data distributions. When trained on real data, BMs tend to behave like critical systems, displaying a high susceptibility of the model under a small rescaling of the inferred parameters. This behavior is inconvenient for generating data, because it slows down the sampling process and induces the model to overfit the training data. In this study, we introduce a regularization method for BMs to improve the robustness of the model under rescaling of the parameters. The new technique shares formal similarities with the unlearning algorithm, an iterative procedure used to improve memory associativity in Hopfield-like neural networks. We test our unlearning regularization on synthetic data generated by two simple models, the Curie–Weiss ferromagnetic model and the Sherrington–Kirkpatrick spin glass model. We show that it outperforms L_p-norm schemes and discuss the role of parameter initialization. Finally, the method is applied to learn the activity of real neuronal cells, confirming its efficacy at shifting the inferred model away from criticality and making it a strong candidate for practical scientific applications.
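
The sensitivity to parameter rescaling mentioned in the abstract can be illustrated on a toy model. The sketch below is not the authors' method or code. It assumes ±1 units, an energy E(s) = -1/2 sᵀJs - hᵀs, and a uniform rescaling of all parameters by a factor beta, so that P_beta(s) ∝ exp(-beta E(s)). Under these assumptions the derivative of the mean energy with respect to beta equals minus the energy variance, so a large variance near beta = 1 is one possible indicator of a model that reacts strongly to small rescalings. All function names are hypothetical, and exact enumeration only works for a handful of units.

```python
import itertools
import numpy as np


def curie_weiss_couplings(n, j0):
    """Uniform couplings J_ij = j0 / n for i != j, zero diagonal (illustrative choice)."""
    return (j0 / n) * (np.ones((n, n)) - np.eye(n))


def boltzmann_distribution(J, h, beta=1.0):
    """Exact distribution over all 2^n states of +/-1 units (assumed convention)."""
    n = len(h)
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    # E(s) = -1/2 s^T J s - h^T s, with J symmetric and zero on the diagonal.
    energies = -0.5 * np.einsum("ki,ij,kj->k", states, J, states) - states @ h
    log_w = -beta * energies
    log_w -= log_w.max()  # subtract the maximum for numerical stability
    p = np.exp(log_w)
    p /= p.sum()
    return states, energies, p


def mean_energy(J, h, beta):
    """Average energy under the model with all parameters rescaled by beta."""
    _, energies, p = boltzmann_distribution(J, h, beta)
    return float(p @ energies)


def energy_variance(J, h, beta):
    """Var(E) under the rescaled model; equals -d<E>/d(beta)."""
    _, energies, p = boltzmann_distribution(J, h, beta)
    mean = p @ energies
    return float(p @ (energies - mean) ** 2)


if __name__ == "__main__":
    n = 10
    J = curie_weiss_couplings(n, j0=1.0)
    h = np.zeros(n)
    # Scan the rescaling factor and print the response of the model.
    for beta in np.linspace(0.5, 1.5, 5):
        print(f"beta={beta:.2f}  Var(E)={energy_variance(J, h, beta):.4f}")
```

Scanning beta and looking at where the variance (or the related quantity beta² Var(E)) is largest gives a rough picture of how close a small model sits to its most sensitive point.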

Published in Machine Learning: Science and Technology

ISSN: 2632-2153 (Online)
Publisher: IOP Publishing
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://iopscience.iop.org/journal/2632-2153
