Syntax and prejudice: ethically-charged biases of a syntax-based hate speech recognizer unveiled

Michele Mastromattei; Leonardo Ranaldi; Francesca Fallucchi; Fabio Massimo Zanzotto

doi:10.7717/peerj-cs.859

PeerJ Computer Science (Feb 2022)

Affiliations

Michele Mastromattei: Department of Enterprise Engineering, University of Roma “Tor Vergata”, Rome, Italy
Leonardo Ranaldi: Department of Innovation and Information Engineering, Guglielmo Marconi University, Rome, Italy
Francesca Fallucchi: Department of Innovation and Information Engineering, Guglielmo Marconi University, Rome, Italy
Fabio Massimo Zanzotto: Department of Enterprise Engineering, University of Roma “Tor Vergata”, Rome, Italy

DOI: https://doi.org/10.7717/peerj-cs.859
Journal volume & issue: Vol. 8, p. e859

Abstract

Hate speech recognizers (HSRs) can be the panacea for containing hate in social media, or they can result in the biggest form of prejudice-based censorship, hindering people from expressing their true selves. In this paper, we hypothesize that massive use of syntax can reduce the prejudice effect in HSRs. To explore this hypothesis, we propose the Unintended-bias Visualizer based on Kermit modeling (KERM-HATE): a syntax-based HSR endowed with syntax heat parse trees that serve as a post-hoc explanation of its classifications. KERM-HATE significantly outperforms BERT-based, RoBERTa-based and XLNet-based HSRs on standard datasets. Surprisingly, this result is not sufficient. In fact, the post-hoc analysis of novel datasets on recent divisive topics shows that even KERM-HATE carries the prejudice distilled from the initial corpus. Therefore, although tests on standard datasets may show higher performance, syntax alone cannot drive the “attention” of HSRs to ethically-unbiased features.

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
