Gutenberg Goes Neural: Comparing Features of Dutch Human Translations with Raw Neural Machine Translation Outputs in a Corpus of English Literary Classics

Rebecca Webster; Margot Fonteyne; Arda Tezcan; Lieve Macken; Joke Daems

doi:10.3390/informatics7030032

Informatics (Aug 2020)

Affiliations

Rebecca Webster: LT3, Language and Translation Technology Team, Ghent University, 9000 Ghent, Belgium
Margot Fonteyne: LT3, Language and Translation Technology Team, Ghent University, 9000 Ghent, Belgium
Arda Tezcan: LT3, Language and Translation Technology Team, Ghent University, 9000 Ghent, Belgium
Lieve Macken: LT3, Language and Translation Technology Team, Ghent University, 9000 Ghent, Belgium
Joke Daems: LT3, Language and Translation Technology Team, Ghent University, 9000 Ghent, Belgium

Journal volume & issue: Vol. 7, no. 3, p. 32

Abstract

Due to the growing success of neural machine translation (NMT), many have started to question its applicability within the field of literary translation. To grasp the possibilities of NMT, we studied the output of two neural machine translation systems, Google Translate (GNMT) and DeepL, when applied to four classic novels translated from English into Dutch. The quality of the NMT systems is discussed on the basis of manual annotations, and we also employed various metrics to gain insight into lexical richness, local cohesion, and syntactic and stylistic differences. Firstly, we discovered that a large proportion of the translated sentences contained errors. We also observed a lower level of lexical richness and local cohesion in the NMT outputs compared to the human translations. In addition, the NMT outputs are more likely to follow the syntactic structure of the source sentence, whereas the human translations can diverge from it. Lastly, the human translations deviate from the machine translations in style.

Published in Informatics

ISSN: 2227-9709 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/informatics
