Machine Learning: Science and Technology (Jan 2025)

Exploring the energy landscape of RBMs: reciprocal space insights into bosons, hierarchical learning and symmetry breaking

  • J Quetzalcóatl Toledo-Marin
  • Anindita Maiti
  • Geoffrey C Fox
  • Roger G Melko

DOI: https://doi.org/10.1088/2632-2153/adf521
Journal volume & issue: Vol. 6, No. 3, p. 035030

Abstract

Deep generative models have become ubiquitous due to their ability to learn and sample from complex distributions. Despite the proliferation of various frameworks, the relationships among these models remain largely unexplored, a gap that hinders the development of a unified theory of AI learning. In this work, we address two central challenges: clarifying the connections between different deep generative models and deepening our understanding of their learning mechanisms. We focus on Restricted Boltzmann Machines (RBMs), a class of generative models known for their universal approximation capabilities for discrete distributions. By introducing a reciprocal space formulation for RBMs, we reveal a connection between these models, diffusion processes, and systems of coupled bosons. Our analysis shows that at initialization, the RBM operates at a saddle point, where the local curvature is determined by the singular values of the weight matrix, whose distribution follows the Marchenko-Pastur law and exhibits rotational symmetry. During training, this rotational symmetry is broken due to hierarchical learning, in which different degrees of freedom progressively capture features at multiple levels of abstraction. This leads to a symmetry breaking in the energy landscape, reminiscent of Landau's theory, characterized by the singular values and the eigenvectors of the weight matrix. We derive the corresponding free energy in a mean-field approximation and show that in the infinite-size limit of the RBM, the reciprocal variables are Gaussian distributed. Our findings indicate that in this regime, there are modes for which the diffusion process does not converge to the Boltzmann distribution. To illustrate our results, we trained replicas of RBMs with different hidden-layer sizes on the MNIST dataset. Our findings not only bridge the gap between disparate generative frameworks but also shed light on the fundamental processes underpinning learning in deep generative models.
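The abstract's claim that the singular-value spectrum of an untrained weight matrix follows the Marchenko-Pastur law is straightforward to check numerically. Below is a minimal sketch (not the authors' code): the layer sizes (784 visible units as for MNIST, a hypothetical 400 hidden units), the i.i.d. Gaussian initialization, and its unit variance are illustrative assumptions.

```python
# Minimal sketch: compare the empirical singular-value spectrum of a
# randomly initialized RBM weight matrix against the Marchenko-Pastur
# prediction. Layer sizes and initialization are assumptions, not the
# paper's exact setup.
import numpy as np
import matplotlib.pyplot as plt

n_visible, n_hidden = 784, 400   # MNIST-sized visible layer; assumed hidden size
sigma = 1.0                      # assumed std of i.i.d. Gaussian entries

rng = np.random.default_rng(0)
W = rng.normal(0.0, sigma, size=(n_hidden, n_visible))

# Eigenvalues of the scaled Gram matrix W W^T / n_visible are s^2 / n_visible,
# where s are the singular values of W.
s = np.linalg.svd(W, compute_uv=False)
eigs = s**2 / n_visible

# Marchenko-Pastur density for aspect ratio q = n_hidden / n_visible <= 1:
# rho(x) = sqrt((lam_plus - x)(x - lam_minus)) / (2 pi sigma^2 q x)
q = n_hidden / n_visible
lam_minus = sigma**2 * (1 - np.sqrt(q))**2
lam_plus = sigma**2 * (1 + np.sqrt(q))**2
x = np.linspace(lam_minus, lam_plus, 500)
rho = np.sqrt((lam_plus - x) * (x - lam_minus)) / (2 * np.pi * sigma**2 * q * x)

plt.hist(eigs, bins=40, density=True, alpha=0.5, label="empirical")
plt.plot(x, rho, label="Marchenko-Pastur")
plt.xlabel(r"eigenvalue of $W W^\top / n_v$")
plt.ylabel("density")
plt.legend()
plt.show()
```

Under these assumptions the histogram should closely trace the analytic density; during training, the top singular values detach from this bulk, which is the rotational symmetry breaking the abstract describes.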

Keywords