IEEE Access (Jan 2024)
Efficient Scaling of Bayesian Neural Networks
Abstract
While Bayesian neural networks (BNNs) have gained popularity for their theoretical guarantees and robustness, they have yet to see a convincing implementation at scale. This study investigates a variational inference-based neural architecture called Variational Density Propagation (VDP) that offers noise robustness, self-compression, and improved explanations over traditional (deterministic) neural networks. However, the large computational burden associated with BNNs has so far prevented these methods from scaling efficiently to real-world problems. In this study, we simplify the VDP architecture, reducing its time and space requirements and enabling efficient scaling to ImageNet-level problems. We also evaluate the inherent properties of the VDP method to validate the simplified approach. Across all datasets and architectures, our method exhibits exceptional self-compression, retaining performance even with over 90% of its parameters pruned. The method also produces improved visual explanations via saliency maps, indicating superior explanation quality compared to deterministic models. Lastly, we employ the VDP method to train a vision transformer on ImageNet-1k, a task previously infeasible due to the method's inherent computational constraints. Our code has been made readily available at the link below.
Keywords