Cybersecurity (Mar 2024)
Maxwell’s Demon in MLP-Mixer: towards transferable adversarial attacks
Abstract
Models based on the MLP-Mixer architecture are becoming popular, but they remain vulnerable to adversarial examples. Although MLP-Mixer has been shown to be more robust to adversarial attacks than convolutional neural networks (CNNs), no prior research has studied adversarial attacks tailored to its architecture. In this paper, we fill this gap by proposing a dedicated attack framework called Maxwell’s demon Attack (MA). Specifically, we disrupt the channel-mixing and token-mixing mechanisms of the MLP-Mixer by perturbing the inputs of each Mixer layer, which yields high transferability. We demonstrate that masking these inputs, and thereby disrupting the MLP-Mixer’s capture of the main information in an image, generates adversarial examples with cross-architectural transferability. Extensive evaluations show the effectiveness and superior performance of MA: perturbations generated from masked inputs achieve a higher black-box attack success rate than existing transfer attacks. Moreover, our approach can be easily combined with existing methods to improve transferability both among MLP-Mixer based models and to models with different architectures, improving attack performance by up to 55.9%. Our work exploits the true generalization potential of the MLP-Mixer adversarial space and helps make the architecture more robust for future deployments.
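To make the idea of masking the inputs of each Mixer layer concrete, the following is a minimal NumPy sketch of a forward pass through a toy MLP-Mixer in which a random binary mask is applied to the input of every layer. It is an illustration under stated assumptions, not the MA algorithm itself: the abstract does not specify how masks are chosen, so the element-wise Bernoulli mask, the `keep_prob` parameter, and all function names (`mixer_layer`, `random_mask`, `masked_forward`, `init_layer`) are hypothetical. The gradient step that would turn the masked forward pass into a perturbation is omitted.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def layer_norm(x, eps=1e-6):
    # Normalize over the last axis (no learned scale/shift in this toy version).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def mlp(x, w1, w2):
    # Two-layer perceptron applied along the last axis.
    return gelu(x @ w1) @ w2


def mixer_layer(x, params):
    # x has shape (tokens, channels).
    # Token mixing: transpose so the MLP acts across tokens, then transpose back.
    y = layer_norm(x)
    x = x + mlp(y.T, params["tok_w1"], params["tok_w2"]).T
    # Channel mixing: the MLP acts across channels of each token.
    y = layer_norm(x)
    return x + mlp(y, params["ch_w1"], params["ch_w2"])


def random_mask(shape, keep_prob, rng):
    # ASSUMPTION: element-wise Bernoulli mask; the paper's masking rule may differ.
    return (rng.random(shape) < keep_prob).astype(np.float64)


def masked_forward(x, layers, keep_prob, rng):
    # Mask the input of every Mixer layer before it is processed.
    for params in layers:
        x = mixer_layer(x * random_mask(x.shape, keep_prob, rng), params)
    return x


def init_layer(tokens, channels, hidden, rng):
    # Small random weights for a toy layer (hypothetical initialization).
    s = 0.1
    return {
        "tok_w1": rng.normal(0.0, s, (tokens, hidden)),
        "tok_w2": rng.normal(0.0, s, (hidden, tokens)),
        "ch_w1": rng.normal(0.0, s, (channels, hidden)),
        "ch_w2": rng.normal(0.0, s, (hidden, channels)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [init_layer(tokens=4, channels=8, hidden=16, rng=rng) for _ in range(2)]
    x = rng.normal(size=(4, 8))
    out = masked_forward(x, layers, keep_prob=0.7, rng=rng)
    print(out.shape)
```

In a transfer-attack setting, one would presumably differentiate a loss on the output of such a masked forward pass with respect to the input image and use that gradient to update the perturbation; this sketch stops at the forward pass because the abstract gives no further detail.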
Keywords