Improved Urdu-English Neural Machine Translation with a fully Convolutional Neural Network Encoder

Huma Israr; Muhammad Khuram Shahzad; Shahid Anwar

doi:10.33889/IJMEMS.2024.9.5.056

International Journal of Mathematical, Engineering and Management Sciences (Oct 2024)

Affiliations

Huma Israr: School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
Muhammad Khuram Shahzad: School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
Shahid Anwar: Department of Information Engineering Technology (IET), National Skills University, Islamabad, Pakistan.

DOI: https://doi.org/10.33889/IJMEMS.2024.9.5.056
Journal volume & issue: Vol. 9, no. 5
pp. 1067 – 1088

Abstract

Neural machine translation (NMT) approaches driven by artificial intelligence (AI) have gained increasing attention in recent years, mainly because of their simplicity and state-of-the-art performance. Although NMT models with attention mechanisms rely heavily on the availability of substantial parallel corpora, they have demonstrated efficacy even for languages with limited linguistic resources. The convolutional neural network (CNN) is frequently employed in visual and speech recognition tasks. Implementing a CNN for MT remains challenging compared with the predominant approaches. Recent research has shown that CNN-based NMT models cannot capture long-term dependencies in the source sentence: they capture word dependencies only within the width of their filters. This limitation often leads to worse performance for CNN-based NMT than for RNN-based NMT models. This study introduces a simple method to improve neural translation for a low-resource language pair, Urdu-English (UR-EN). We use a Fully Convolutional Neural Network (FConv-NN) based NMT architecture to create a powerful MT encoder for UR-EN translation that can capture long-range word dependencies in a sentence. Although the model is quite simple, it yields strong empirical results. Experimental results show that the FConv-NN model consistently outperforms the traditional CNN-based model with filters. On the Urdu-English dataset, the FConv-NN model produces translations with a gain of 18.42 BLEU points. Moreover, quantitative and comparative analysis shows that, in a low-resource setting, FConv-NN-based NMT outperforms conventional CNN-based NMT models.
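
The abstract's point that a convolutional layer sees only the words inside its filter width can be made concrete with a little arithmetic. The sketch below is a general illustration, not the authors' architecture: it assumes stride-1, undilated 1-D convolutions, and the function name `receptive_field` and the kernel widths are hypothetical choices for the example. Under those assumptions each extra layer of width k widens the visible context by k - 1 source positions.

```python
def receptive_field(kernel_widths):
    """Source positions visible to one output position of a stack of
    stride-1, undilated 1-D convolutions (an assumed, generic setup)."""
    field = 1  # with no layers, a position sees only itself
    for k in kernel_widths:
        field += k - 1  # each layer extends the context by k - 1 positions
    return field


# A single width-3 filter covers 3 words; six stacked width-3 layers cover 13.
single_layer = receptive_field([3])
stacked = receptive_field([3] * 6)
print(single_layer, stacked)  # 3 13
```

Under these assumptions a single filter never relates words more than a few positions apart, while a deeper stack can, which is one common way convolutional encoders are made to reach longer-range context.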

Published in International Journal of Mathematical, Engineering and Management Sciences

ISSN: 2455-7749 (Online)
Publisher: Ram Arti Publishers
Country of publisher: India
LCC subjects: Technology; Science: Mathematics
Website: https://ijmems.in
