IEEE Access (Jan 2019)

Hierarchical Transfer Learning Architecture for Low-Resource Neural Machine Translation

  • Gongxu Luo,
  • Yating Yang,
  • Yang Yuan,
  • Zhanheng Chen,
  • Aizimaiti Ainiwaer

DOI
https://doi.org/10.1109/ACCESS.2019.2936002
Journal volume & issue
Vol. 7
pp. 154157 – 154166

Abstract


Neural Machine Translation (NMT) has achieved notable results on high-resource languages, but still performs poorly on low-resource languages. It is widely recognized that transfer learning methods are effective for low-resource language problems. However, existing transfer learning methods are typically based on the parent-child architecture, which does not take full advantage of helpful languages. In this paper, inspired by the human ability of transitive inference and learning, we address this issue by proposing a new hierarchical transfer learning architecture for low-resource languages. In this architecture, the NMT model is trained on an unrelated high-resource language pair, a similar intermediate language pair, and the low-resource language pair in turn. Correspondingly, the parameters are transferred and fine-tuned layer by layer for initialization. In this way, our hierarchical transfer learning architecture simultaneously combines the data-volume advantage of high-resource languages and the syntactic-similarity advantage of cognate languages. In particular, we utilize Byte Pair Encoding (BPE) and character-level embedding for data pre-processing, which effectively alleviates the out-of-vocabulary (OOV) problem. Experimental results on Uygur-Chinese and Turkish-English translation demonstrate the superiority of the proposed architecture over the NMT model with the parent-child architecture.
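The staged training described in the abstract can be illustrated with a minimal sketch: the same model is trained on three corpora in turn, with the learned parameters carried forward as initialization at each stage. This is not the authors' implementation; the toy model, the shared vocabulary across stages, and the random placeholder batches below are illustrative assumptions.

```python
# Minimal sketch (assumptions: a toy encoder-decoder, a vocabulary shared
# across stages, and random placeholder data instead of real BPE-segmented
# parallel corpora). It shows only the hierarchical transfer pattern:
# high-resource pair -> intermediate related pair -> low-resource pair.
import copy
import torch
import torch.nn as nn


class TinyNMT(nn.Module):
    """Toy encoder-decoder used only to illustrate parameter transfer."""

    def __init__(self, src_vocab=1000, tgt_vocab=1000, dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))       # encode source
        dec, _ = self.decoder(self.tgt_emb(tgt), h)  # decode conditioned on source
        return self.out(dec)


def train_stage(model, batches, epochs=1, lr=1e-3):
    """Fine-tune the model on one language pair (one stage of the hierarchy)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for src, tgt in batches:
            logits = model(src, tgt[:, :-1])
            loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                           tgt[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


def random_batches(n=4, batch=8, src_len=10, tgt_len=11, vocab=1000):
    """Placeholder corpora; real data would be BPE-segmented parallel text."""
    return [(torch.randint(0, vocab, (batch, src_len)),
             torch.randint(0, vocab, (batch, tgt_len))) for _ in range(n)]


# Stage 1: train on the unrelated high-resource language pair.
model = TinyNMT()
model = train_stage(model, random_batches())

# Stage 2: transfer the parameters and fine-tune on the similar
# intermediate language pair.
intermediate = TinyNMT()
intermediate.load_state_dict(copy.deepcopy(model.state_dict()))
intermediate = train_stage(intermediate, random_batches())

# Stage 3: transfer again and fine-tune on the low-resource pair itself.
child = TinyNMT()
child.load_state_dict(copy.deepcopy(intermediate.state_dict()))
child = train_stage(child, random_batches())
```

In the paper the layers are transferred and fine-tuned for initialization at each stage; the sketch approximates this by copying the full parameter set between stages via the state dict.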

Keywords