IEEE Access (Jan 2023)

TT-MLP: Tensor Train Decomposition on Deep MLPs

  • Jiale Yan,
  • Kota Ando,
  • Jaehoon Yu,
  • Masato Motomura

DOI
https://doi.org/10.1109/ACCESS.2023.3240784
Journal volume & issue
Vol. 11
pp. 10398 – 10411

Abstract

Deep multilayer perceptrons (MLPs) have achieved promising performance on computer vision tasks. Like conventional MLPs, deep MLPs consist solely of fully connected layers, but they adopt more sophisticated architectures built from mixer layers that combine token-mixing and channel-mixing components. These architectures give deep MLPs global receptive fields, but the accompanying increase in parameters becomes a heavy burden in practical applications. To tackle this problem, we focus on compressing deep MLPs with tensor-train decomposition (TTD). First, this paper analyzes deep MLPs under conventional TTD methods across various designs of the macro framework and micro blocks: the former concerns how mixer layers are concatenated, and the latter concerns how a mixer layer is designed. Based on this analysis, we propose a novel TTD method named Train-TTD-Train. The proposed method leverages the learning capability of the channel-mixing components and improves the trade-off between accuracy and model size. In the evaluation, the proposed method showed a better trade-off than conventional TTD methods on ImageNet-1K and achieved 0.56% higher inference accuracy with a 15.44% memory reduction on CIFAR-10.
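To make the compression idea concrete, the sketch below shows the generic principle of tensor-train factorization applied to a single fully connected layer: a weight matrix of size M x N (with M = m1*m2, N = n1*n2) is replaced by two small TT-cores. This is only an illustrative sketch of the general TTD idea; the layer sizes, rank r, and function names are assumptions for the example and do not reproduce the paper's specific Train-TTD-Train scheme or its treatment of token-mixing versus channel-mixing components.

```python
import numpy as np

# Illustrative sketch (not the paper's exact method): a dense weight W of
# shape (M, N), with M = m1*m2 and N = n1*n2, is represented by two TT-cores
# G1 of shape (m1*n1, r) and G2 of shape (r, m2*n2). The layer then stores
# r*(m1*n1 + m2*n2) values instead of M*N.
m1, m2, n1, n2, r = 16, 16, 24, 24, 8            # hypothetical sizes and TT-rank
G1 = 0.02 * np.random.randn(m1 * n1, r)          # first TT-core
G2 = 0.02 * np.random.randn(r, m2 * n2)          # second TT-core

def tt_linear(x):
    """Apply the TT-factorized layer to a batch x of shape (B, m1*m2)."""
    # Implicitly reconstruct W[(i1,i2),(j1,j2)] = sum_a G1[(i1,j1),a] * G2[a,(i2,j2)],
    # then reorder indices so rows follow (i1,i2) and columns follow (j1,j2).
    W = (G1 @ G2).reshape(m1, n1, m2, n2).transpose(0, 2, 1, 3).reshape(m1 * m2, n1 * n2)
    return x @ W

x = np.random.randn(4, m1 * m2)
y = tt_linear(x)                                  # shape (4, n1*n2)
print(y.shape)

dense_params = m1 * m2 * n1 * n2
tt_params = r * (m1 * n1 + m2 * n2)
print(f"dense: {dense_params} params, TT: {tt_params} params "
      f"({tt_params / dense_params:.1%} of dense)")
```

In practice the cores would be trained directly (or obtained by decomposing a pretrained weight) and the product would never be materialized for large layers; the sketch forms W explicitly only to keep the index bookkeeping visible.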

Keywords