MatLit (Mar 2016)

Sentence-alignment and application of Russian-German multi-target parallel corpora for linguistic analysis and literary studies

  • Zhekova, Desislava
  • Zangenfeind, Robert
  • Mikhaylova, Alena
  • Nikolaienko, Tetiana

DOI
https://doi.org/10.14195/2182-8830_4-1_3
Journal volume & issue
Vol. 4, no. 1
pp. 45 – 61

Abstract

This paper presents the application of multi-target parallel corpora, which consist of a single source text and multiple target translations of it, to linguistic analysis. We discuss the alignment, interactive search and visualization of this type of data in a dedicated tool called ALuDo (Alignment with Lucene for Dostoyevsky), a Java implementation that uses local grammars, ontological information, bilingual dictionaries and statistical approaches for alignment and search. The data set is the Russian novel Crime and Punishment by Fyodor Dostoyevsky together with three German translations of it. This bilingual corpus supports a range of investigations in linguistics and literary studies. Additionally, we release part of the resulting parallel corpus.

Keywords