An analysis of Google Translate and DeepL translation of source text typographical errors in the economic and legal fields

Santiago Rodríguez-Rubio Mediavilla

doi:10.58992/rld.i81.2024.4188

Revista de Llengua i Dret - Journal of Language and Law (Jun 2024)

Abstract

Training neural machine translation systems with noisy data has been shown to improve robustness (Heigold et al., 2018). The present study tests the performance of Google Translate and DeepL in detecting and correcting typographical errors, by introducing 1,820 source text typos found in previous work on specialised Spanish-English dictionaries (Rodríguez-Rubio & Fernández-Quesada, 2020a, 2020b; Rodríguez-Rubio Mediavilla, 2021). Typos were introduced both in isolation and in co-text. Results showed that Google Translate clearly outperformed DeepL. Moreover, repetition of the source typo was the most frequent phenomenon in the machine translation output of both systems. By shedding light on the capacity of these systems to deal with source text typographical errors, the study aims to provide a starting point for their refinement.

Published in Revista de Llengua i Dret - Journal of Language and Law

ISSN: 0212-5056 (Print); 2013-1453 (Online)
Publisher: Escola d'Administració Pública de Catalunya
Country of publisher: Spain
LCC subjects: Language and Literature: Romanic languages
Website: https://revistes.eapc.gencat.cat/index.php/rld
