Revista de Llengua i Dret - Journal of Language and Law (Jun 2024)

An analysis of Google Translate and DeepL translation of source text typographical errors in the economic and legal fields

  • Santiago Rodríguez-Rubio Mediavilla

DOI
https://doi.org/10.58992/rld.i81.2024.4188

Abstract

Training neural machine translation systems with noisy data has been shown to improve robustness (Heigold et al., 2018). The objective of the present study is to test the performance of Google Translate and DeepL in detecting and correcting typographical errors, by introducing 1,820 source text typos found in previous work on specialised Spanish-English dictionaries (Rodríguez-Rubio & Fernández-Quesada, 2020a, 2020b; Rodríguez-Rubio Mediavilla, 2021). Typos were introduced both in isolation and in co-text. Results showed that Google Translate clearly outperformed DeepL. Moreover, repetition of the source typo was found to be the most frequent phenomenon in the machine translation output of both systems. By shedding light on the capacity of these systems to deal with source text typographical errors, our study aims to provide a starting point for their refinement.

Keywords