o-bib. Das offene Bibliotheksjournal (May 2023)
Diversity and bias in DBpedia and Wikidata as a challenge for text-analysis tools
Abstract
Diversity Searcher is a tool originally developed to help analyse diversity in news media texts. It relies on automated content analysis and therefore rests on prior assumptions and on certain design choices related to diversity. One such design choice is the external knowledge source(s) used. In this article, we discuss the implications that these sources can have for the results of content analysis. We compare two data sources that Diversity Searcher has worked with – DBpedia and Wikidata – with respect to their ontological coverage and diversity, and describe the implications for the resulting analyses of text corpora. We present a case study of the relative over- or underrepresentation of Belgian political parties between 1990 and 2020. In particular, we found a staggering overrepresentation of the political right in the English-language DBpedia.
Keywords