Vietnam Journal of Computer Science (May 2022)

Empirical Analysis of Phrase-Based Statistical Machine Translation System for English to Hindi Language

  • Arun Babhulgaonkar,
  • Shefali Sonavane

DOI
https://doi.org/10.1142/S219688882250004X
Journal volume & issue
Vol. 09, no. 02
pp. 135 – 162

Abstract

Hindi is the national language of India. However, most Government records, resolutions, news, etc. are documented in English, which remote villagers may not understand. This motivates the development of an automatic translation system from English to Hindi. Machine translation is the process of translating text in one natural language into another natural language using a computer system. The grammatical structure of Hindi is considerably more complex than that of English, and the structural differences between the two languages make it difficult to achieve good-quality translation results. In this paper, a phrase-based statistical machine translation (PBSMT) approach is used for translation. The translation model, the reordering model, and the language model are the main working components of a PBSMT system. This paper evaluates the impact of various combinations of these PBSMT system parameters on the quality of automated English-to-Hindi translation. The freely available n-gram-based BLEU metric and the TER metric are used to evaluate the results.

Keywords