IEEE Access (Jan 2023)

Implicit Cross-Lingual Word Embedding Alignment for Reference-Free Machine Translation Evaluation

  • Min Zhang
  • Hao Yang
  • Yanqing Zhao
  • Xiaosong Qiao
  • Shimin Tao
  • Song Peng
  • Ying Qin
  • Yanfei Jiang

DOI
https://doi.org/10.1109/ACCESS.2023.3260835
Journal volume & issue
Vol. 11
pp. 32241 – 32251

Abstract


Cross-lingual word embedding alignment is critically important for reference-free machine translation evaluation, where source texts are compared directly with system translations. In this paper, we show that multilingual knowledge distillation for sentence embedding alignment implicitly achieves cross-lingual word embedding alignment. A simplified analysis explains why this implicit alignment arises; it further predicts that the last-layer embeddings of the distilled student model yield the best alignment, a prediction validated by experimental results on the WMT19 datasets. Furthermore, with the assistance of a target-side language model, BERTScore and Word Mover's Distance computed on these cross-lingual word embeddings achieve highly competitive results on the WMT19 reference-free machine translation evaluation tasks when compared against current state-of-the-art (SOTA) metrics: four best average scores across the three types of language directions, and first place on more than half of all 18 language pairs in the system-level evaluations.
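To illustrate the core idea, the following is a minimal sketch of BERTScore-style greedy matching applied in the reference-free setting, where source-token embeddings are compared directly with translation-token embeddings. It assumes the two embedding matrices already live in a cross-lingually aligned space (e.g., last-layer states of a distilled multilingual student model); the function name and shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def bertscore_f1(src_emb: np.ndarray, hyp_emb: np.ndarray) -> float:
    """Greedy-matching F1 between source and hypothesis token embeddings.

    src_emb: (n_src, d) token embeddings of the source sentence
    hyp_emb: (n_hyp, d) token embeddings of the system translation
    Assumes both come from a cross-lingually aligned embedding space.
    """
    # L2-normalize rows so dot products become cosine similarities
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    hyp = hyp_emb / np.linalg.norm(hyp_emb, axis=1, keepdims=True)
    sim = hyp @ src.T                   # (n_hyp, n_src) cosine matrix
    precision = sim.max(axis=1).mean()  # each hyp token -> best src match
    recall = sim.max(axis=0).mean()     # each src token -> best hyp match
    return 2 * precision * recall / (precision + recall)
```

With perfectly aligned embeddings the score approaches 1; spurious tokens on the hypothesis side lower precision, and uncovered source tokens lower recall, which is why cross-lingual alignment quality directly determines the usefulness of the metric.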

Keywords