The reusability prior: comparing deep learning models without training

Aydın Göze Polat; Ferda Nur Alpaslan

doi:10.1088/2632-2153/acc713

Machine Learning: Science and Technology (Jan 2023)

Affiliations

Aydın Göze Polat: Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
Ferda Nur Alpaslan: Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey

DOI: https://doi.org/10.1088/2632-2153/acc713
Journal volume & issue: Vol. 4, no. 2, p. 025011

Abstract

Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts for model components during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter. This allows models to be compared without any training. We provide supporting evidence with experiments using cross-layer parameter sharing on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks. We give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the results, including at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
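The abstract does not state how the number of contexts is computed, so the following is only a toy illustration of the general idea, not the authors' methodology. It assumes a model can be drawn as a directed acyclic graph whose edges are labeled with parameter names, and it uses the number of input-to-output paths crossing each parameter's edges as a stand-in for "contexts". The function name `count_contexts`, the edge-list format, and the path-count proxy are all hypothetical choices made for this sketch.

```python
from collections import defaultdict


def count_contexts(edges, source, sink):
    """Toy proxy (an assumption, not the paper's definition): for each
    parameter name, count the source-to-sink paths that traverse an edge
    labeled with that parameter. `edges` is a list of (u, v, param) tuples
    describing a directed acyclic graph."""
    succ = defaultdict(list)
    pred = defaultdict(list)
    nodes = set()
    for u, v, _ in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    indeg = {n: len(pred[n]) for n in nodes}
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)

    # Number of paths from the source to each node.
    from_src = defaultdict(int)
    from_src[source] = 1
    for n in order:
        for m in succ[n]:
            from_src[m] += from_src[n]

    # Number of paths from each node to the sink.
    to_sink = defaultdict(int)
    to_sink[sink] = 1
    for n in reversed(order):
        for m in succ[n]:
            to_sink[n] += to_sink[m]

    # Paths through edge (u, v) = paths into u times paths out of v.
    # Edges that share a parameter name accumulate into one count.
    contexts = defaultdict(int)
    for u, v, param in edges:
        contexts[param] += from_src[u] * to_sink[v]
    return dict(contexts)


# Three-layer chain with separate weights versus one weight shared across layers.
baseline = [("x", "h1", "w1"), ("h1", "h2", "w2"), ("h2", "y", "w3")]
shared = [("x", "h1", "w"), ("h1", "h2", "w"), ("h2", "y", "w")]

print(count_contexts(baseline, "x", "y"))  # {'w1': 1, 'w2': 1, 'w3': 1}
print(count_contexts(shared, "x", "y"))    # {'w': 3}
```

Under this proxy, the shared-weight chain has a third of the parameters but each parameter is counted in three times as many contexts, which mirrors the abstract's claim that such counts can be compared across designs before any training takes place.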

Published in Machine Learning: Science and Technology

ISSN: 2632-2153 (Online)
Publisher: IOP Publishing
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://iopscience.iop.org/journal/2632-2153
