Leveraging linear mapping for model-agnostic adversarial defense

Huma Jamil; Yajing Liu; Nathaniel Blanchard; Michael Kirby; Chris Peterson

doi:10.3389/fcomp.2023.1274832

Frontiers in Computer Science (Oct 2023)

Affiliations

Huma Jamil: Department of Computer Science, Colorado State University, Fort Collins, CO, United States
Yajing Liu: Department of Mathematics, Colorado State University, Fort Collins, CO, United States
Nathaniel Blanchard: Department of Computer Science, Colorado State University, Fort Collins, CO, United States
Michael Kirby: Department of Computer Science, Colorado State University, Fort Collins, CO, United States
Michael Kirby: Department of Mathematics, Colorado State University, Fort Collins, CO, United States
Chris Peterson: Department of Mathematics, Colorado State University, Fort Collins, CO, United States

DOI: https://doi.org/10.3389/fcomp.2023.1274832
Journal volume & issue: Vol. 5

Abstract

In the ever-evolving landscape of deep learning, novel designs of neural network architectures have been thought to drive progress by enhancing embedded representations. However, recent findings reveal that the embedded representations of various state-of-the-art models can be mapped to one another via a simple linear map, challenging the notion that architectural variations are meaningfully distinctive. While these linear maps have been established for traditional non-adversarial datasets, e.g., ImageNet, to our knowledge no work has explored the linear relation between adversarial image representations of these datasets generated by different CNNs. Accurately mapping adversarial images signals the feasibility of generalizing an adversarial defense optimized for a specific network. In this work, we demonstrate the existence of a linear mapping of adversarial inputs between different models that can be exploited to develop such a model-agnostic, generalized adversarial defense. We further propose an experimental setup designed to illustrate this model-agnostic defense: we train a linear classifier on both adversarial and non-adversarial embeddings within the defended space, and then assess its performance on adversarial embeddings from other models that are mapped to this space. Our approach achieves an AUROC of up to 0.99 on both the CIFAR-10 and ImageNet datasets.
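
The setup described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the embeddings are synthetic stand-ins for CNN features, the linear map and the linear classifier are both fit by ordinary least squares (the abstract does not say how either is fit), and every name here (`fit_linear_map`, `fit_linear_detector`, `detector_scores`, `auroc`, `run_demo`) is hypothetical.

```python
import numpy as np


def fit_linear_map(src, dst):
    """Least-squares affine map (W, b) such that src @ W + b approximates dst."""
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(src_aug, dst, rcond=None)
    return coef[:-1], coef[-1]


def fit_linear_detector(emb, labels):
    """Least-squares linear classifier (assumed stand-in for any linear classifier)."""
    emb_aug = np.hstack([emb, np.ones((emb.shape[0], 1))])
    w, *_ = np.linalg.lstsq(emb_aug, 2.0 * labels - 1.0, rcond=None)
    return w


def detector_scores(emb, w):
    """Higher score means 'more likely adversarial'."""
    return emb @ w[:-1] + w[-1]


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def run_demo(seed=0, n=1000, latent_dim=8, shift=4.0):
    """Synthetic experiment: model A is the defended space, model B is mapped into it."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)  # 1 = adversarial, 0 = non-adversarial
    direction = rng.normal(size=latent_dim)
    direction /= np.linalg.norm(direction)
    # Assumption: adversarial inputs are shifted along one latent direction.
    latent = rng.normal(size=(n, latent_dim)) + shift * labels[:, None] * direction
    # Two "models" embed the same inputs through different linear projections.
    emb_a = latent @ rng.normal(size=(latent_dim, 16))
    emb_b = latent @ rng.normal(size=(latent_dim, 12)) + 0.01 * rng.normal(size=(n, 12))

    train, test = slice(0, n // 2), slice(n // 2, n)
    W, b = fit_linear_map(emb_b[train], emb_a[train])      # map B -> A
    w = fit_linear_detector(emb_a[train], labels[train])   # trained in A's space only
    mapped = emb_b[test] @ W + b                           # held-out B embeddings in A's space
    return auroc(detector_scores(mapped, w), labels[test])


if __name__ == "__main__":
    print(f"AUROC on mapped embeddings: {run_demo():.3f}")
```

The detector never sees embeddings from model B during training; it is evaluated only on B's held-out embeddings after they pass through the fitted map, which mirrors the model-agnostic evaluation the abstract describes. With real networks, `emb_a` and `emb_b` would be replaced by features extracted from the same images by two different CNNs.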

Published in Frontiers in Computer Science

ISSN: 2624-9898 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/computer-science#
