Knowing is Half the Battle: Enhancing Clean Data Accuracy of Adversarial Robust Deep Neural Networks via Dual-Model Bounded Divergence Gating

Hossein Aboutalebi; Mohammad Javad Shafiee; Chi-En Amy Tai; Alexander Wong

doi:10.1109/ACCESS.2023.3347498

IEEE Access (Jan 2024)

Affiliations

Hossein Aboutalebi: Department of Computer Science, University of Waterloo, Waterloo, Canada
Mohammad Javad Shafiee: Department of Systems Design, University of Waterloo, Waterloo, Canada
Chi-En Amy Tai: Department of Systems Design, University of Waterloo, Waterloo, Canada
Alexander Wong: Department of Systems Design, University of Waterloo, Waterloo, Canada

DOI: https://doi.org/10.1109/ACCESS.2023.3347498
Journal volume & issue: Vol. 12
pp. 48174 – 48188

Abstract

Significant advances have been made in recent years in improving the robustness of deep neural networks, particularly under adversarial machine learning scenarios where the data has been contaminated to fool networks into making undesirable predictions. However, such improvements in adversarial robustness have often come at a significant cost in model accuracy on uncontaminated data (i.e., clean data), making such defense mechanisms challenging to adopt in real-world practical scenarios where data is primarily clean and accuracy needs to be high. Motivated to find a better balance between adversarial robustness and clean data accuracy, we propose a new model-agnostic adversarial defense mechanism named Dual-model Bounded Divergence (DBD), driven by a theoretical and empirical analysis of the bias-variance trade-off within an adversarial machine learning context. More specifically, the proposed DBD mechanism is premised on the observation that the variance in deep neural networks tends to increase in the presence of adversarial perturbations in the input data. As such, DBD employs a gating mechanism that decides the final model prediction based on a novel dual-model variance measure (coined DBD Variance), which is a bounded version of the KL-Divergence between models. Not only is the proposed DBD mechanism itself training-free, but it can also be combined with existing adversarial defense mechanisms to improve the balance between clean data accuracy and adversarial robustness. Comprehensive experiments with more than 10 state-of-the-art adversarial defense mechanisms, on both the CIFAR-10 and ImageNet benchmark datasets and under different adversarial attacks (e.g., APGD, AutoAttack), demonstrate that integrating DBD can improve clean data accuracy by as much as 6% without compromising much on adversarial robustness.
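
The gating idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of DBD Variance, so everything below is an assumption made for illustration: the bounded score is taken to be 1 - exp(-symmetric KL), the two models are assumed to be a standard model and an adversarially robust model, and the names `bounded_divergence`, `gated_predict`, and `threshold` are hypothetical. It is not the authors' implementation.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def bounded_divergence(p, q):
    """Hypothetical bounded score in [0, 1): 1 - exp(-symmetric KL).

    Assumption: the paper's DBD Variance may be defined differently.
    """
    sym_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 - math.exp(-max(sym_kl, 0.0))

def gated_predict(logits_standard, logits_robust, threshold=0.5):
    """Choose which model's prediction to return based on the bounded score.

    Assumption: a high score signals a possibly perturbed input, so the
    robust model is used; otherwise the standard model is used.
    """
    p = softmax(logits_standard)
    q = softmax(logits_robust)
    score = bounded_divergence(p, q)
    chosen = q if score > threshold else p
    label = max(range(len(chosen)), key=chosen.__getitem__)
    return label, score

# The models nearly agree: the score is low, so the standard model decides.
label_close, score_close = gated_predict([1.0, 1.2, 0.0], [1.2, 1.0, 0.0])

# The models strongly disagree: the score is high, so the robust model decides.
label_far, score_far = gated_predict([0.0, 6.0, 0.0], [5.0, 0.0, 0.0])
```

Because the gate only compares the outputs of two already-trained models, a sketch like this needs no additional training, which is consistent with the training-free property claimed in the abstract. The threshold value here is arbitrary and would need to be tuned in practice.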

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
