Jisuanji kexue (Jan 2022)

Unsupervised Domain Adaptation Based on Style Aware

  • NING Qiu-yi, SHI Xiao-jing, DUAN Xiang-yu, ZHANG Min

DOI
https://doi.org/10.11896/jsjkx.201200094
Journal volume & issue
Vol. 49, no. 1
pp. 271 – 278

Abstract

In recent years, neural machine translation has made significant progress in translation quality, but its training relies heavily on parallel bilingual sentence pairs. Parallel resources are scarce in the e-commerce domain, however, and cultural differences lead to stylistic differences in how product information is expressed. To address these two problems, a style-aware unsupervised domain adaptation algorithm is proposed. It makes full use of e-commerce monolingual data through mutual training, and introduces a quasi knowledge distillation approach to handle the style differences. We construct a non-parallel bilingual corpus from e-commerce product data, and then carry out experiments using this corpus together with a Chinese-English news parallel corpus. The results show that the algorithm significantly improves translation quality over various unsupervised domain adaptation methods, gaining about 5 BLEU points over the strongest baseline system. The algorithm is further extended to TED, Law, and Medical OPUS data, and achieves better translation results on all of them.

Keywords