IEEE Access (Jan 2025)
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning
Abstract
While Convolutional Neural Networks (CNNs) excel at learning complex latent-space representations, their over-parameterization can lead to overfitting and reduced performance, particularly when training data is limited. This, together with their high computational and memory demands, limits the applicability of CNNs to edge deployment and other settings where computational resources are constrained. Low-rank matrix approximation has emerged as a promising approach to reducing CNN parameters, but existing methods often require pre-determined ranks or involve complex post-training adjustments, which makes rank selection difficult, degrades performance, and limits practicality in resource-constrained environments. This underscores the need for an adaptive compression method that integrates into the training process and dynamically adjusts model complexity based on data and task requirements. To address this, we propose an efficient training method for CNN compression via dynamic parameter rank pruning. Our approach combines efficient matrix factorization with novel regularization techniques to form a robust framework for dynamic rank pruning and model compression. We model low-rank convolutional filters and dense weight matrices with Singular Value Decomposition (SVD) and train the SVD factors end-to-end with back-propagation, so that the model is compressed as it is trained. We evaluate our method on modern CNNs, including ResNet-18, ResNet-20, and ResNet-32, using the CIFAR-10, CIFAR-100, and ImageNet (2012) datasets. Our experiments demonstrate that the proposed method can reduce model parameters by up to 50% and improve classification accuracy by up to 2% over baseline models, making CNNs more feasible for practical applications.
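To illustrate why pruning the rank of an SVD-factored layer reduces parameters, the following is a minimal sketch of the parameter accounting only. It assumes a dense weight matrix of shape m x n stored as factors U (m x r), s (r), and V (n x r). The layer shape, the singular values, and the relative-threshold rule in `prune_rank` are hypothetical and chosen for illustration; they are not the regularization or pruning criterion proposed in this work.

```python
def factored_param_count(m, n, rank):
    """Parameters stored by U (m x rank), s (rank), and V (n x rank)."""
    return rank * (m + n + 1)


def break_even_rank(m, n):
    """Largest rank at which the factored form is strictly smaller than m * n."""
    return (m * n - 1) // (m + n + 1)


def prune_rank(singular_values, rel_threshold=0.05):
    """Keep singular values that are at least rel_threshold times the largest.

    Assumption: this simple thresholding rule stands in for a learned,
    regularization-driven criterion and is used here only for illustration.
    """
    largest = max(singular_values)
    return [s for s in singular_values if s >= rel_threshold * largest]


# Hypothetical dense layer with 512 outputs and 256 inputs.
m, n = 512, 256
dense_params = m * n

# Made-up singular values of a partially trained factorization.
singular_values = [9.0, 4.5, 2.0, 0.8, 0.3, 0.05, 0.01]

kept = prune_rank(singular_values, rel_threshold=0.05)
rank = len(kept)
low_rank_params = factored_param_count(m, n, rank)

print("kept rank:", rank)
print("dense parameters:", dense_params)
print("factored parameters:", low_rank_params)
print("break-even rank:", break_even_rank(m, n))
```

The factored form saves parameters only while the retained rank stays below the break-even rank, which is why a method that lowers the rank during training translates directly into a smaller model.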
Keywords