Machine Translation in Low-Resource Languages by an Adversarial Neural Network

Mengtao Sun; Hao Wang; Mark Pasquine; Ibrahim A. Hameed

doi:10.3390/app112210860

Applied Sciences (Nov 2021)


Affiliations

Mengtao Sun: Department of ICT and Natural Sciences, Norwegian University of Science and Technology, 6009 Ålesund, Norway
Hao Wang: Department of Computer Science, Norwegian University of Science and Technology, 2815 Gjøvik, Norway
Mark Pasquine: Department of International Business, Norwegian University of Science and Technology, 6009 Ålesund, Norway
Ibrahim A. Hameed: Department of ICT and Natural Sciences, Norwegian University of Science and Technology, 6009 Ålesund, Norway

DOI: https://doi.org/10.3390/app112210860
Journal volume & issue: Vol. 11, no. 22, p. 10860

Abstract


Existing Sequence-to-Sequence (Seq2Seq) Neural Machine Translation (NMT) shows strong capability with High-Resource Languages (HRLs). However, this approach faces serious challenges when processing Low-Resource Languages (LRLs), because the model's expressiveness is limited by the number of parallel sentence pairs available for training. This study uses adversarial and transfer learning techniques to mitigate the lack of sentence pairs in LRL corpora. We propose a new Low resource, Adversarial, Cross-lingual (LAC) model for NMT. For the adversarial technique, the LAC model consists of a generator and a discriminator. The generator is a Seq2Seq model that produces translations from the source language to the target language, while the discriminator measures the gap between machine and human translations. In addition, we apply transfer learning to the LAC model to help capture features in scarce resources, because some languages share the same subject-verb-object grammatical structure. Rather than using the entire pretrained LAC model, we use the pretrained generator and the pretrained discriminator separately. The pretrained discriminator exhibited better performance in all experiments. Experimental results demonstrate that the LAC model achieves higher Bilingual Evaluation Understudy (BLEU) scores and has good potential to augment LRL translations.
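The abstract reports results as BLEU scores. For readers unfamiliar with the metric, the following is a minimal sketch of a sentence-level BLEU computation, assuming a single reference translation, whitespace-tokenized input, uniform n-gram weights, and no smoothing. The function names `ngrams` and `sentence_bleu` are illustrative and do not come from the paper, and the authors' actual evaluation setup may differ (for example, corpus-level scoring or a different tokenizer).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified BLEU: one reference, uniform weights, no smoothing (assumptions)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precision_sum += math.log(matches / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    human = "the cat sat on the mat".split()
    machine = "the cat sat on mat".split()
    print(sentence_bleu(machine, human, max_n=2))
```

In this sketch a machine translation identical to the human reference scores 1.0, and a translation sharing no words with it scores 0.0; higher scores indicate closer n-gram overlap with the human translation.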

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
