Clustering Approach for Detecting Multiple Types of Adversarial Examples

Seok-Hwan Choi; Tae-u Bahk; Sungyong Ahn; Yoon-Ho Choi

doi:10.3390/s22103826

Sensors (May 2022)

Affiliations

Seok-Hwan Choi: School of Computer Science and Engineering, Pusan National University, Busan 46241, Korea
Tae-u Bahk: Korea Apparel Testing & Research Institute, Seoul 02579, Korea
Sungyong Ahn: School of Computer Science and Engineering, Pusan National University, Busan 46241, Korea
Yoon-Ho Choi: School of Computer Science and Engineering, Pusan National University, Busan 46241, Korea

DOI: https://doi.org/10.3390/s22103826
Journal volume & issue: Vol. 22, no. 10, p. 3826

Abstract

By intentionally perturbing input features, an adversary generates an adversarial example that deceives a deep learning model. Because adversarial examples have recently come to be regarded as one of the most severe problems in deep learning technology, defense methods against them have been actively studied. Effective defense methods against adversarial examples fall into one of three architectures: (1) model retraining architecture; (2) input transformation architecture; and (3) adversarial example detection architecture. Defense methods using the adversarial example detection architecture in particular have been actively studied, because they do not make wrong decisions on legitimate input data, whereas the others do. In this paper, we note that current defense methods using the adversarial example detection architecture can classify input data only as either legitimate or adversarial. That is, they can detect adversarial examples but cannot classify the input data into multiple classes, i.e., legitimate input data and various types of adversarial examples. To classify the input data into multiple classes while increasing the accuracy of the clustering model, we propose an advanced defense method using the adversarial example detection architecture, which extracts key features from the input data and feeds the extracted features into a clustering model. Experimental results on various application datasets show that the proposed method detects adversarial examples while also classifying their types. We also show that the proposed method is more accurate than recent defense methods using the adversarial example detection architecture.
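
The pipeline the abstract describes (extract features from the input, then feed them to a clustering model so that legitimate inputs and each type of adversarial example fall into separate clusters) can be illustrated with a minimal sketch. The abstract does not specify the feature extractor or the clustering algorithm, so everything here is an assumption: the two-dimensional feature vectors are synthetic stand-ins for extracted features, plain k-means stands in for the clustering model, and the group names `adversarial_type_a` and `adversarial_type_b` are hypothetical.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means with farthest-point initialisation.

    A stand-in for the paper's clustering model, which the abstract
    does not name. Returns (centroids, labels).
    """
    rng = random.Random(seed)
    centroids = [points[rng.randrange(len(points))]]
    while len(centroids) < k:
        # Pick the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid for each point.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Synthetic "extracted features" (assumption): one well-separated blob per
# input class. Real features would come from a feature extractor.
GROUP_CENTERS = {
    "legitimate": (0.0, 0.0),
    "adversarial_type_a": (5.0, 0.0),
    "adversarial_type_b": (0.0, 5.0),
}
rng = random.Random(42)
features, truth = [], []
for name, center in GROUP_CENTERS.items():
    for _ in range(30):
        features.append(tuple(c + rng.gauss(0, 0.2) for c in center))
        truth.append(name)

centroids, labels = kmeans(features, k=len(GROUP_CENTERS))

# Name each cluster by the majority class of its members. This uses the
# known labels of the synthetic data and is an illustrative choice only.
cluster_names = {
    j: Counter(t for t, lab in zip(truth, labels) if lab == j).most_common(1)[0][0]
    for j in range(len(GROUP_CENTERS))
}


def classify(feature):
    """Assign a new feature vector to the class of its nearest centroid."""
    nearest = min(
        range(len(centroids)), key=lambda j: math.dist(feature, centroids[j])
    )
    return cluster_names[nearest]


# Fraction of points whose cluster name matches their true class.
purity = sum(cluster_names[lab] == t for lab, t in zip(labels, truth)) / len(truth)
```

Calling `classify` on a new feature vector returns one of the three class names rather than a binary legitimate/adversarial verdict, which is the distinction the abstract draws between the proposed method and existing detectors. On real data the clusters would overlap, so separation this clean should not be expected.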

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
