All Your Fake Detector are Belong to Us: Evaluating Adversarial Robustness of Fake-News Detectors Under Black-Box Settings

Hassan Ali; Muhammad Suleman Khan; Amer Alghadhban; Meshari Alazmi; Ahmad Alzamil; Khaled Al-Utaibi; Junaid Qadir

doi:10.1109/ACCESS.2021.3085875

IEEE Access (Jan 2021)

Affiliations

Hassan Ali: IHSAN Lab, Information Technology University, Lahore, Pakistan
Muhammad Suleman Khan: Department of Computer Science, Information Technology University (ITU), Lahore, Pakistan
Amer Alghadhban: Department of Electrical Engineering, College of Engineering, University of Ha’il, Ha’il, Saudi Arabia
Meshari Alazmi: Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia
Ahmad Alzamil: Department of Electrical Engineering, College of Engineering, University of Ha’il, Ha’il, Saudi Arabia
Khaled Al-Utaibi: Department of Computer Engineering, College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia
Junaid Qadir: Department of Electrical Engineering, Information Technology University (ITU), Lahore, Pakistan

DOI: https://doi.org/10.1109/ACCESS.2021.3085875
Journal volume & issue: Vol. 9, pp. 81678–81692

Abstract

With the hyperconnectivity and ubiquity of the Internet, the fake news problem now presents a greater threat than ever before. One promising solution for countering this threat is to leverage deep learning (DL)-based text classification methods for fake-news detection. However, since such methods have been shown to be vulnerable to adversarial attacks, the integrity and security of DL-based fake-news classifiers are in question. Although many works study text classification under adversarial threat, to the best of our knowledge, no prior work in the literature specifically analyzes the performance of DL-based fake-news detectors under adversarial settings. We bridge this gap by evaluating the performance of fake-news detectors in various configurations under black-box settings. In particular, we investigate the robustness of four different DL architectural choices (multilayer perceptron (MLP), convolutional neural network (CNN), recurrent neural network (RNN), and a recently proposed Hybrid CNN-RNN), trained on three different state-of-the-art datasets, under different adversarial attacks (Text Bugger, Text Fooler, PWWS, and Deep Word Bug) implemented using the state-of-the-art NLP attack library, Text-Attack. Additionally, we explore how changing the detector complexity, the input sequence length, and the training loss affects the robustness of the learned model. Our experiments suggest that RNNs are more robust than the other architectures. Further, we show that increasing the input sequence length generally increases the detector’s robustness. Our evaluations provide key insights for making fake-news detectors more robust against adversarial attacks.
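
To make the black-box evaluation described in the abstract more concrete, the following is a minimal, self-contained Python sketch of how robustness numbers of this kind might be computed. It is not the authors' implementation and does not use Text-Attack: the keyword-based `toy_detector`, the character-swap perturbation, the tiny `DATASET`, and all function names are hypothetical stand-ins chosen for illustration. The attack only queries the detector's predicted label, which is what "black-box" means here, and only correctly classified inputs are attacked.

```python
import random
from typing import Callable, List, Tuple

# A black-box detector: the attacker sees only the predicted label
# (1 = fake, 0 = real), never weights or gradients.
Classifier = Callable[[str], int]


def toy_detector(text: str) -> int:
    """Hypothetical stand-in for a trained detector (keyword match only)."""
    triggers = {"shocking", "hoax", "miracle"}
    return int(any(word in triggers for word in text.lower().split()))


def swap_adjacent_chars(word: str, rng: random.Random) -> str:
    """Swap two adjacent interior characters, a typo-style perturbation."""
    if len(word) < 4:
        return word  # too short to perturb without touching first/last char
    i = rng.randrange(1, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def black_box_attack(model: Classifier, text: str, rng: random.Random,
                     max_queries: int = 50) -> Tuple[str, bool]:
    """Perturb one word at a time; succeed as soon as the label flips."""
    original_label = model(text)
    words = text.split()
    for idx in range(min(len(words), max_queries)):
        candidate = words.copy()
        candidate[idx] = swap_adjacent_chars(words[idx], rng)
        adversarial = " ".join(candidate)
        if model(adversarial) != original_label:  # one query per candidate
            return adversarial, True
    return text, False


def evaluate(model: Classifier, dataset: List[Tuple[str, int]],
             seed: int = 0) -> Tuple[float, float, float]:
    """Return (clean accuracy, accuracy under attack, attack success rate)."""
    rng = random.Random(seed)
    # Attack only the examples the detector already classifies correctly.
    correct = [(text, label) for text, label in dataset if model(text) == label]
    successes = sum(black_box_attack(model, text, rng)[1] for text, _ in correct)
    clean_acc = len(correct) / len(dataset)
    adv_acc = (len(correct) - successes) / len(dataset)
    success_rate = successes / len(correct) if correct else 0.0
    return clean_acc, adv_acc, success_rate


# Illustrative, made-up examples (not from any dataset used in the paper).
DATASET = [
    ("shocking miracle cure found", 1),
    ("celebrity death hoax spreads online", 1),
    ("council approves new budget", 0),
    ("miracle on ice anniversary celebrated", 0),
]

if __name__ == "__main__":
    clean, adversarial, rate = evaluate(toy_detector, DATASET)
    print(f"clean={clean:.2f} under_attack={adversarial:.2f} success={rate:.2f}")
```

In this toy setting, the example with two trigger words survives every single-word perturbation, while the example with one trigger word is flipped. A real evaluation would swap in a trained MLP, CNN, RNN, or hybrid model for `toy_detector` and an established attack for `black_box_attack`.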

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
