IEEE Access (Jan 2023)

Transfer and Triangulation Pivot Translation Approaches for Burmese Dialects

  • Thazin Myint Oo,
  • Thitipong Tanprasert,
  • Ye Kyaw Thu,
  • Thepchai Supnithi

DOI
https://doi.org/10.1109/ACCESS.2023.3236804
Journal volume & issue
Vol. 11
pp. 6150 – 6168

Abstract

Parallel corpora for the languages of Myanmar (Beik, Burmese, Rakhine) are extremely scarce, yet they are a necessary requirement for machine translation R&D. Previous studies have shown that pivoting leads to better translation quality if the bridge language is closely related to the source and target languages. The baseline study is based on three major approaches to machine translation: Weighted Finite State Transducer (WFST), Phrase-Based Statistical Machine Translation (PBSMT), and Deep Recurrent Neural Network (Deep-RNN). Building on the baseline results, this paper mainly investigates the pivot language technique for PBSMT with Burmese dialects. We employed two different pivot translation methods: transfer (sentence level) and triangulation (phrase level). We present experimental results for Dawei-Beik, Beik-Dawei, Beik-Rakhine, and Rakhine-Beik translation via Burmese. Both the transfer and triangulation approaches outperformed the baseline (direct translation), specifically for the Rakhine-Beik language pair. Moreover, the average BiLingual Evaluation Understudy (BLEU), Character n-gram F-score (chrF), and Word Error Rate (WER) scores from the 10-fold cross-validation experiments showed that the triangulation pivot performs significantly better than the transfer pivot. We plan to release the parallel corpora of Burmese dialects and present several avenues for further research.
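The difference between the two pivot methods named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which triangulation merges a source-pivot phrase table and a pivot-target phrase table by summing over shared pivot phrases, while transfer simply chains two complete translation systems. The function names, the dictionary-based phrase tables, and the placeholder tokens (`s1`, `p1`, `t1`, ...) are hypothetical and are not taken from the paper, whose exact implementation may differ.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Phrase-level pivoting (assumed formulation):
    p(t | s) = sum over pivot phrases p of p(p | s) * p(t | p).

    Both inputs map a phrase to {translation: probability}.
    """
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for piv, p_piv_given_src in pivots.items():
            # Only pivot phrases present in both tables contribute.
            for tgt, p_tgt_given_piv in pivot_tgt.get(piv, {}).items():
                src_tgt[src][tgt] += p_piv_given_src * p_tgt_given_piv
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


def transfer(sentence, translate_src_pivot, translate_pivot_tgt):
    """Sentence-level pivoting: translate into the pivot language,
    then translate that output into the target language."""
    return translate_pivot_tgt(translate_src_pivot(sentence))


# Toy phrase tables with placeholder tokens (not real corpus data).
src_pivot = {"s1": {"p1": 0.6, "p2": 0.4}}
pivot_tgt = {
    "p1": {"t1": 0.9, "t2": 0.1},
    "p2": {"t1": 0.5, "t2": 0.5},
}
merged = triangulate(src_pivot, pivot_tgt)
```

In this sketch the merged table keeps every target phrase reachable through any pivot phrase, whereas the transfer pipeline commits to a single pivot sentence before the second translation step, so errors in the first step carry over to the second.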

Keywords