Who are the haters? A corpus-based demographic analysis of authors of hate speech

Lisa Hilte; Ilia Markov; Nikola Ljubešić; Darja Fišer; Walter Daelemans

doi:10.3389/frai.2023.986890

Frontiers in Artificial Intelligence (May 2023)

Affiliations

Lisa Hilte: CLIPS, Department of Linguistics, Faculty of Arts, University of Antwerp, Antwerp, Belgium
Ilia Markov: CLTL, Department of Language, Literature and Communication, Faculty of Humanities, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
Nikola Ljubešić: Department of Knowledge Technologies, Institut Jožef Stefan (IJS), Ljubljana, Slovenia
Nikola Ljubešić: Laboratory for Cognitive Modeling, Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
Nikola Ljubešić: Institute of Contemporary History, Ljubljana, Slovenia
Darja Fišer: Department of Knowledge Technologies, Institut Jožef Stefan (IJS), Ljubljana, Slovenia
Darja Fišer: Institute of Contemporary History, Ljubljana, Slovenia
Darja Fišer: Department of Translation, Faculty of Arts, University of Ljubljana, Ljubljana, Slovenia
Walter Daelemans: CLIPS, Department of Linguistics, Faculty of Arts, University of Antwerp, Antwerp, Belgium

DOI: https://doi.org/10.3389/frai.2023.986890
Journal volume & issue: Vol. 6

Abstract

Introduction: We examine the profiles of hate speech authors in a multilingual dataset of Facebook reactions to news posts discussing topics related to migrants and the LGBT+ community. The included languages are English, Dutch, Slovenian, and Croatian.

Methods: First, all utterances were manually annotated as hateful or acceptable speech. Next, we used binary logistic regression to inspect how the production of hateful comments is impacted by authors' profiles (i.e., their age, gender, and language).

Results: Our results corroborate previous findings: in all four languages, men produce more hateful comments than women, and people produce more hate speech as they grow older. But our findings also add important nuance to previously attested tendencies: specific age and gender dynamics vary slightly in different languages or cultures, suggesting that distinct (e.g., socio-political) realities are at play.

Discussion: Finally, we discuss why author demographics are important in the study of hate speech: the profiles of prototypical “haters” can be used for hate speech detection, for raising awareness of the spread of (online) hatred, and for counter-initiatives against it.

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
