Systems (Oct 2024)

Dynamic Multi-Granularity Translation System: DAG-Structured Multi-Granularity Representation and Self-Attention

  • Shenrong Lv,
  • Bo Yang,
  • Ruiyang Wang,
  • Siyu Lu,
  • Jiawei Tian,
  • Wenfeng Zheng,
  • Xiaobing Chen,
  • Lirong Yin

DOI: https://doi.org/10.3390/systems12100420
Journal volume & issue: Vol. 12, no. 10, p. 420

Abstract


In neural machine translation (NMT), the sophistication of word embeddings plays a pivotal role in the model’s ability to render accurate and contextually relevant translations. However, conventional models that rely on a single word-segmentation granularity cannot fully represent complex languages such as Chinese, where the choice of segmentation granularity significantly affects understanding and translation fidelity. To address these challenges, our study introduces the Dynamic Multi-Granularity Translation System (DMGTS), an innovative approach that enhances the Transformer model with multi-granularity position encoding and multi-granularity self-attention mechanisms. Leveraging a Directed Acyclic Graph (DAG), the DMGTS uses four levels of word segmentation for multi-granularity position encoding. Dynamic word embeddings are also introduced to enrich the lexical representation with multi-granularity features, and multi-granularity self-attention mechanisms replace the conventional self-attention layers. We evaluate the DMGTS on multiple datasets, where it demonstrates marked improvements in translation quality, evidenced by gains of 1.16 and 1.55 Bilingual Evaluation Understudy (BLEU) points over traditional static embedding methods. These results underscore the efficacy of the DMGTS in refining NMT performance.
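The core idea stated in the abstract, replacing standard self-attention with attention over units drawn from several segmentation granularities at once, can be illustrated with a minimal sketch. The class name, tensor shapes, and the learned per-granularity bias below are illustrative assumptions for exposition, not the paper's implementation or its DAG-based position encoding; only the notion of queries attending to a multi-granularity key/value set is taken from the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiGranularitySelfAttention(nn.Module):
    """Illustrative multi-granularity self-attention (a sketch, not the DMGTS code).

    A sentence is represented at several segmentation granularities
    (e.g. character, word, phrase). Queries come from the finest level,
    while keys/values are gathered from all granularities so that each
    position can also attend to coarser segmentation units.
    """

    def __init__(self, d_model: int, num_granularities: int = 4):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Hypothetical learned bias marking which granularity a key came from.
        self.gran_bias = nn.Parameter(torch.zeros(num_granularities))

    def forward(self, fine: torch.Tensor, grains: list) -> torch.Tensor:
        # fine:   (B, T, d)  finest-granularity embeddings (e.g. characters)
        # grains: list of (B, T_g, d) embeddings, one tensor per granularity
        q = self.q_proj(fine)                                   # (B, T, d)
        keys, vals, bias = [], [], []
        for g, h in enumerate(grains):
            keys.append(self.k_proj(h))
            vals.append(self.v_proj(h))
            bias.append(self.gran_bias[g].expand(h.size(1)))    # (T_g,)
        k = torch.cat(keys, dim=1)                              # (B, sum T_g, d)
        v = torch.cat(vals, dim=1)
        b = torch.cat(bias).view(1, 1, -1)                      # broadcast over batch and queries
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5) + b
        attn = F.softmax(scores, dim=-1)                        # (B, T, sum T_g)
        return attn @ v                                         # (B, T, d)


# Toy usage: 2 sentences, 4 granularities with progressively fewer units.
mga = MultiGranularitySelfAttention(d_model=64)
fine = torch.randn(2, 30, 64)
grains = [torch.randn(2, n, 64) for n in (30, 18, 9, 5)]
out = mga(fine, grains)   # shape (2, 30, 64)
```

In this sketch the per-granularity bias stands in for the richer multi-granularity position encoding described in the abstract; the actual system derives segmentation units and their positions from a DAG over the four segmentation levels.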

Keywords