Дискурс профессиональной коммуникации (Mar 2025)

Principles of Developing a Chinese-Russian Polysemantic Dictionary as a Means of Improving Interpretability of Neural Machine Translators

  • E. V. Chistova

DOI
https://doi.org/10.24833/2687-0126-2025-7-1-89-107
Journal volume & issue
Vol. 7, no. 1
pp. 89–107

Abstract

This research addresses the challenge of polysemy in neural machine translation (NMT), particularly for the Chinese-Russian language pair, which is marked by significant interlingual and intercultural asymmetry. Despite considerable advances in NMT, the accurate translation of polysemous words remains a key obstacle to high-quality automated text generation, often leading to misinterpretation and hindering effective communication. Methodologies for developing specialized dictionaries that address this issue for NMT systems are currently lacking. This article aims to define the qualitative characteristics of detailed polysemantic dictionaries designed to enhance the interpretability of NMT, specifically for Chinese-Russian translation. The study employs eco-cognitive modeling of professional translator communication to investigate human-machine interaction in handling lexical ambiguity, focusing on the cognitive processes involved in disambiguation. Parallel Chinese-Russian texts serve as the material and are processed manually to identify polysemous units that are challenging for NMT. Based on this manual analysis, the article proposes a theoretical framework for bilingual dictionary compilation and outlines principles for structuring dictionary entries so that they capture subtleties of lexical usage. The developed algorithm details the manual processing of parallel texts and the design of dictionary entry schemes tailored for NMT. The research identifies key qualitative characteristics for detailed Chinese-Russian parallel training corpora. These include linguistic and definitional parameters, comprehensive dictionary representation, and translation variability informed by lexico-grammatical compatibility, discourse-genre affiliation, and conceptual-categorical taxonomy. This study contributes to translation theory by offering a practical approach to enhancing NMT interpretability through targeted dictionary development.
The findings are relevant for improving machine translation quality, particularly for complex language pairs, and ultimately for facilitating more effective cross-lingual communication and knowledge exchange in spheres such as business and academic research.

Keywords