Journal of Open Humanities Data (Jul 2021)

The Telegram Chronicles of Online Harm

  • Tatjana Scheffler,
  • Veronika Solopova,
  • Mihaela Popa-Wyatt

DOI
https://doi.org/10.5334/johd.31
Journal volume & issue
Vol. 7

Abstract

Harmful language is frequent in social media, particularly in spaces that are considered anonymous and/or allow free participation. In this paper, we analyze the language in a Telegram channel populated by followers of former US President Donald Trump. We seek to identify the ways in which harmful language is used to create a specific narrative in a group of mostly like-minded discussants. Our research has several aims. First, we create an extended taxonomy of potentially harmful language that includes not only hate speech and direct insults (which have been the focus of existing computational methods), but also other forms of harmful speech discussed in the literature. We manually apply this taxonomy to a large portion of the corpus, including the period leading up to the January 2021 US Capitol riot and its aftermath. Our data provide empirical evidence for forms of harmful speech, such as in/out-group divisive language and the use of codes within certain communities, that have rarely been investigated before. Second, we compare our manual annotations of harmful speech to several automatic methods for classifying hate speech and offensive language, namely list-based and machine-learning-based approaches. We find that the Telegram data sets still pose particular challenges for these automatic methods. Finally, we argue for the value of studying such naturally occurring, coherent data sets for research on online harm and how to address it in linguistics and philosophy.

Keywords