Improving Transformer-Based Neural Machine Translation with Prior Alignments

Thien Nguyen; Lam Nguyen; Phuoc Tran; Huu Nguyen

doi:10.1155/2021/5515407

Complexity (Jan 2021)

Affiliations

Thien Nguyen: Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Lam Nguyen: Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Phuoc Tran: Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Huu Nguyen: Faculty of Information Technology, Ho Chi Minh City University of Food Industry, Ho Chi Minh City, Vietnam

Journal volume & issue: Vol. 2021

Abstract

The transformer is a neural machine translation model that has revolutionized machine translation. Compared with traditional statistical machine translation models and other neural machine translation models, the recently proposed transformer model fundamentally changes machine translation through its self-attention and cross-attention mechanisms, which effectively model token alignments between source and target sentences. It has been reported that the transformer model provides accurate posterior alignments. In this work, we empirically demonstrate the reverse effect: prior alignments help transformer models produce better translations. Experimental results on a Vietnamese-English news translation task show not only that manually annotated alignments benefit transformer models but also, surprisingly, that statistically constructed alignments, reinforced with the flexibility of token-type selection, outperform manual alignments in improving transformer models. Statistically constructed word-to-lemma alignments are used to train a word-to-word transformer model. This novel hybrid transformer model improves on the baseline transformer model and on the transformer model trained with manual alignments by 2.53 and 0.79 BLEU, respectively. In addition to BLEU scores, we conduct a limited human evaluation of the translation results. A strong correlation between human and machine judgments confirms our findings.

Published in Complexity

ISSN: 1076-2787 (Print); 1099-0526 (Online)
Publisher: Hindawi-Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.hindawi.com/journals/complexity/
