IEEE Access (Jan 2023)
Toward Memory-Efficient and Interpretable Factorization Machines via Data and Model Binarization
Abstract
Factorization Machines (FM) are a general class of predictors that model feature interactions in linear time, and have therefore been widely used for regression, classification, and ranking tasks. The Subspace Encoding Factorization Machine (SEFM) is a recent approach that enhances FM's effectiveness through an explicit nonlinear feature mapping for both individual features and feature interactions, realized by equal-width binning of each input feature. Despite its effectiveness, SEFM has a major drawback: it increases the memory cost of FM by a factor of $b$, where $b$ is the number of bins used for binning. To reduce this memory cost, we propose the Binarized FM (BiFM), in which each model parameter takes only a binary value (i.e., 1 or −1) and can therefore be stored in a single bit. We derive an algorithm that learns the proposed model under binary constraints using the Straight-Through Estimator (STE) with Adaptive Gradient Descent (Adagrad). For performance evaluation, we compare our proposed methods with a number of baselines on eight classification datasets. Our experimental results demonstrate that BiFM achieves higher accuracy than SEFM at a much lower memory cost. BiFM also inherits the interpretability of SEFM and, combined with adaptive data binning, yields a more compact and interpretable set of classification rules.
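As a rough illustration of the training scheme described above, the following is a minimal NumPy sketch of a factorization machine whose parameters are binarized with a sign function in the forward pass, while gradients are passed straight through (STE) to real-valued latent parameters that are updated with Adagrad. It assumes a squared-error loss, and all names (`BinarizedFM`, `fit_step`, etc.) are illustrative rather than the authors' implementation.

```python
# Minimal sketch of binarized-FM training with STE + Adagrad (assumed
# squared-error loss); names here are illustrative, not the paper's code.
import numpy as np

def sign(x):
    # Binarize to {+1, -1}; the forward pass uses only these one-bit values.
    return np.where(x >= 0, 1.0, -1.0)

class BinarizedFM:
    def __init__(self, n_features, n_factors=8, lr=0.1, eps=1e-8, seed=0):
        rng = np.random.default_rng(seed)
        # Real-valued latent parameters are kept only during training;
        # at inference time only their signs need to be stored.
        self.w0 = np.zeros(1)
        self.w = rng.normal(0, 0.1, n_features)
        self.V = rng.normal(0, 0.1, (n_features, n_factors))
        self.lr, self.eps = lr, eps
        # Adagrad accumulators (running sums of squared gradients).
        self.g_w0 = np.zeros_like(self.w0)
        self.g_w = np.zeros_like(self.w)
        self.g_V = np.zeros_like(self.V)

    def predict(self, X):
        # Linear-time FM identity for the pairwise term:
        # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(XV)_f^2 - X^2 (V^2)_f]
        Vb, wb, w0b = sign(self.V), sign(self.w), sign(self.w0)
        XV = X @ Vb
        pair = 0.5 * np.sum(XV ** 2 - (X ** 2) @ (Vb ** 2), axis=1)
        return w0b + X @ wb + pair

    def fit_step(self, X, y):
        Vb, wb, w0b = sign(self.V), sign(self.w), sign(self.w0)
        XV = X @ Vb
        pred = w0b + X @ wb + 0.5 * np.sum(XV ** 2 - (X ** 2) @ (Vb ** 2), axis=1)
        err = pred - y                       # dL/dpred for 0.5 * squared error
        n = X.shape[0]
        # Gradients with respect to the *binarized* parameters.
        g_w0 = np.array([err.mean()])
        g_w = X.T @ err / n
        g_V = (X.T @ (err[:, None] * XV)
               - Vb * ((X ** 2).T @ err)[:, None]) / n
        # STE: apply these gradients directly to the real-valued latents,
        # with per-parameter Adagrad step sizes.
        for g, acc, p in ((g_w0, self.g_w0, self.w0),
                          (g_w, self.g_w, self.w),
                          (g_V, self.g_V, self.V)):
            acc += g ** 2
            p -= self.lr * g / (np.sqrt(acc) + self.eps)
        return 0.5 * np.mean(err ** 2)

# Toy usage: fit a small synthetic target with an interaction term.
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 10))
y = np.sign(X[:, 0] * X[:, 1] + X[:, 2])
fm = BinarizedFM(n_features=10)
for _ in range(200):
    loss = fm.fit_step(X, y)
print(f"final training loss: {loss:.3f}")
```

At inference time only the one-bit signs of the parameters are stored, which is the source of the memory savings over SEFM claimed in the abstract.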
Keywords