BadNets: Evaluating Backdooring Attacks on Deep Neural Networks

Tianyu Gu; Kang Liu; Brendan Dolan-Gavitt; Siddharth Garg

doi:10.1109/ACCESS.2019.2909068

IEEE Access (Jan 2019)

Affiliations

Tianyu Gu: Department of Electrical and Computer Engineering, New York University, New York City, NY, USA
Kang Liu: Department of Electrical and Computer Engineering, New York University, New York City, NY, USA
Brendan Dolan-Gavitt: Department of Computer Science and Engineering, New York University, New York City, NY, USA
Siddharth Garg: Department of Electrical and Computer Engineering, New York University, New York City, NY, USA

Journal volume & issue: Vol. 7, pp. 47230–47244

Abstract

Deep learning-based techniques have achieved state-of-the-art performance on a wide variety of recognition and classification tasks. However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task. In this paper, we show that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples but behaves badly on specific attacker-chosen inputs. We first explore the properties of BadNets in a toy example, by creating a backdoored handwritten digit classifier. Next, we demonstrate backdoors in a more realistic scenario by creating a U.S. street sign classifier that identifies stop signs as speed limits when a special sticker is added to the stop sign; we then show that the backdoor in our U.S. street sign detector can persist even if the network is later retrained for another task, causing a drop in accuracy of 25% on average when the backdoor trigger is present. These results demonstrate that backdoors in neural networks are both powerful and, because the behavior of neural networks is difficult to explicate, stealthy. This paper provides motivation for further research into techniques for verifying and inspecting neural networks, just as we have developed tools for verifying and debugging software.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639