Journal of Open Humanities Data (Nov 2023)
Opening a Free Path to Analyze the Discourse Shift in the Soviet Belarusian Newspaper 'Zviazda' after the Molotov-Ribbentrop Pact
Abstract
This paper attempts to develop a pipeline that converts graphical PDF files of the newspaper Zviazda into usable Belarusian-language text data with search and visualization options. Apart from ad hoc conversion scripts for moving between formats, the pipeline relies on freely available resources to process this relatively under-resourced language (under-resourced at least in terms of freely available resources). The pipeline was designed to include a graph database and to be compatible with data visualization tools. The ultimate goal is to develop a resource for analyzing political discourse in the Soviet Belarusian press during the Second World War. To validate the pipeline, a pilot study was carried out: it visualizes some simple manifestations of the shift in Soviet rhetoric about Nazi Germany after the signing of the Molotov-Ribbentrop Pact, in order to show that useful phenomena can be revealed even with quite noisy data.
Keywords