IEEE Access (Jan 2019)
Hierarchical Transfer Learning Architecture for Low-Resource Neural Machine Translation
Abstract
Neural Machine Translation (NMT) has achieved notable results on high-resource languages, but it still performs poorly on low-resource languages. Transfer learning methods are widely recognized as effective for low-resource language problems. However, existing transfer learning methods are typically based on the parent-child architecture, which does not adequately take advantage of helpful languages. In this paper, inspired by the human ability of transitive inference and learning, we address this issue by proposing a new hierarchical transfer learning architecture for low-resource languages. In this architecture, the NMT model is trained on an unrelated high-resource language pair, a similar intermediate language pair, and the low-resource language pair in turn. Correspondingly, the parameters are transferred and fine-tuned layer by layer for initialization. In this way, our hierarchical transfer learning architecture simultaneously combines the data volume advantage of high-resource languages and the syntactic similarity advantage of cognate languages. Specifically, we utilize Byte Pair Encoding (BPE) and character-level embeddings for data pre-processing, which effectively solves the out-of-vocabulary (OOV) problem. Experimental results on Uygur-Chinese and Turkish-English translation demonstrate the superiority of the proposed architecture over the NMT model with the parent-child architecture.
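To make the staged training pipeline concrete, the following is a minimal sketch of the three-stage fine-tuning idea summarized above. The toy GRU encoder-decoder, the shared joint-BPE vocabulary, and the randomly generated dummy corpora are illustrative assumptions, not the paper's actual model or data; only the stage ordering (unrelated high-resource pair, then similar intermediate pair, then the low-resource pair, reusing the same parameters at every stage) follows the abstract.

```python
# Minimal sketch of hierarchical (three-stage) transfer learning for NMT.
# Assumptions for illustration only: a toy GRU encoder-decoder, a shared
# joint-BPE vocabulary across all language pairs, and random dummy corpora
# standing in for the real parallel data.
import torch
import torch.nn as nn

VOCAB, EMB, HID, PAD = 1000, 64, 128, 0

class TinyNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.emb(src))          # encode the source sentence
        dec, _ = self.decoder(self.emb(tgt_in), h)  # teacher-forced decoding
        return self.out(dec)

def train_stage(model, corpus, epochs=1, lr=1e-3):
    """Fine-tune the same parameters on one language pair (one hierarchy level)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for src, tgt in corpus:
            logits = model(src, tgt[:, :-1])
            loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def dummy_corpus(n_batches=5, batch=8, length=12):
    """Random token ids standing in for a BPE-segmented parallel corpus."""
    return [(torch.randint(1, VOCAB, (batch, length)),
             torch.randint(1, VOCAB, (batch, length))) for _ in range(n_batches)]

model = TinyNMT()
# Stage 1: unrelated high-resource language pair.
train_stage(model, dummy_corpus())
# Stage 2: intermediate language pair similar to the low-resource language.
train_stage(model, dummy_corpus())
# Stage 3: the low-resource language pair itself.
train_stage(model, dummy_corpus())
```

The point of the sketch is that no parameters are reinitialized between stages: each `train_stage` call starts from the weights left by the previous stage, which is the transfer-and-fine-tune initialization the abstract describes.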
Keywords