Information (May 2022)

Improving English-to-Indian Language Neural Machine Translation Systems

  • Akshara Kandimalla,
  • Pintu Lohar,
  • Souvik Kumar Maji,
  • Andy Way

DOI
https://doi.org/10.3390/info13050245
Journal volume & issue
Vol. 13, no. 5
p. 245

Abstract

Most Indian languages lack sufficient parallel data for Machine Translation (MT) training. In this study, we build English-to-Indian language Neural Machine Translation (NMT) systems using the state-of-the-art transformer architecture. In addition, we investigate the utility of back-translation and its effect on system performance. Our experimental evaluation reveals that back-translation helps to improve the BLEU scores of both the English-to-Hindi and English-to-Bengali NMT systems. We also observe that back-translation is more useful for improving the quality of weaker baseline MT systems. Finally, we perform a manual evaluation of the translation outputs and observe that the BLEU metric cannot always assess MT quality as well as humans can. Our analysis shows that the MT outputs for the English–Bengali pair are actually better than their BLEU scores indicate.

Keywords