IJCCS (Indonesian Journal of Computing and Cybernetics Systems) (Jul 2024)

Machine Translation Indonesian Bengkulu Malay Using Neural Machine Translation-LSTM

  • Bella Okta Sari Miranda,
  • Herman Yuliansyah,
  • Muhammad Kunta Biddinika

DOI
https://doi.org/10.22146/ijccs.98384
Journal volume & issue
Vol. 18, no. 3

Abstract

Machine translation is an application of Natural Language Processing (NLP) that focuses on translating text between languages. Several previous studies have used Statistical Machine Translation (SMT) with a parallel corpus of Indonesian and Bengkulu Malay totaling 3000 data points. However, SMT performs poorly when confronted with limited data and rarely studied language pairs. Therefore, this study aims to build a machine translation model from Indonesian to Bengkulu Malay using a Neural Machine Translation (NMT) approach with Long Short-Term Memory (LSTM), and to create a parallel corpus of 5261 Indonesian and Bengkulu Malay sentence pairs. The research was conducted in four stages: data collection, data preprocessing, training and modeling, and evaluation. The performance of the machine translator was evaluated using the Bilingual Evaluation Understudy (BLEU) metric. The evaluation results show that the model achieved its highest average score, 0.6016332, on BLEU-1 and its lowest average score, 0.3680788, on BLEU-4. These results suggest that accounting for the natural structural linguistic differences between Indonesian and Bengkulu Malay is the best approach for translating from Indonesian to Bengkulu Malay.
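To make the BLEU-1 to BLEU-4 figures more concrete, the following is a minimal sketch of how sentence-level BLEU-n might be computed. The abstract does not say how the scores were calculated, so several things here are assumptions: cumulative BLEU with uniform weights, a single reference per sentence, and no smoothing. The function names (`ngram_counts`, `modified_precision`, `bleu`) and the placeholder tokens are illustrative only and are not taken from the study.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision of the hypothesis against one reference."""
    hyp_counts = ngram_counts(hypothesis, n)
    ref_counts = ngram_counts(reference, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


def bleu(reference, hypothesis, max_n):
    """Cumulative BLEU-max_n with uniform weights and a brevity penalty (assumed setup)."""
    if not hypothesis:
        return 0.0
    precisions = [modified_precision(reference, hypothesis, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    # Geometric mean of the n-gram precisions for n = 1..max_n.
    log_mean = sum(math.log(p) for p in precisions) / max_n
    ref_len, hyp_len = len(reference), len(hypothesis)
    # Penalize hypotheses that are shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical placeholder tokens, not sentences from the study's corpus.
reference = "w1 w2 w3 w4 w5 w6".split()
hypothesis = "w1 w2 w3 w4 w6 w5".split()
scores = {n: bleu(reference, hypothesis, n) for n in range(1, 5)}
```

In this toy example every word of the hypothesis appears in the reference, so the unigram precision is perfect, but two words are swapped, so fewer of the longer n-grams match and the score falls as n grows. That is the same general pattern as the reported averages, where BLEU-1 is the highest and BLEU-4 the lowest. An average score like those reported could be obtained by taking the mean of such sentence-level scores over a test set, although the abstract does not confirm that this is how the averages were produced.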

Keywords