o-bib. Das offene Bibliotheksjournal (May 2023)

Diversity and bias in DBpedia and Wikidata as a challenge for text-analysis tools

  • Bettina Berendt,
  • Oğuz Özgür Karadeniz,
  • Sercan Kıyak,
  • Stefan Mertens,
  • Leen d'Haenens

DOI
https://doi.org/10.5282/o-bib/5894
Journal volume & issue
Vol. 10, no. 2

Abstract

Diversity Searcher is a tool originally developed to help analyse diversity in news media texts. It relies on automated content analysis and therefore rests on prior assumptions and on certain design choices related to diversity. One such design choice is which external knowledge source or sources to use. In this article, we discuss the implications that these sources can have for the results of content analysis. We compare two data sources that Diversity Searcher has worked with – DBpedia and Wikidata – with respect to their ontological coverage and diversity, and describe the implications for the resulting analyses of text corpora. We present a case study of the relative over- or underrepresentation of Belgian political parties between 1990 and 2020. In particular, we found a staggering overrepresentation of the political right in the English-language DBpedia.

Keywords