Digital Communications and Networks (Apr 2023)

Research on high-performance English translation based on topic model

  • Yumin Shen
  • Hongyu Guo

Journal volume & issue
Vol. 9, no. 2
pp. 505 – 511

Abstract

Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources help improve the performance of machine translation. However, traditional methods based on bilingual parallel corpora often ignore the document background when acquiring and applying retellings. To solve this problem, we introduce topic model information into the translation model and propose a topic-based statistical machine translation method to improve translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents through hybrid matrix decomposition. We then design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve translation accuracy.
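
For readers unfamiliar with PLSA, the following is a minimal sketch of the textbook formulation, fitted with expectation-maximization on a document-word count matrix. It is an illustration only: the abstract does not give the details of the authors' hybrid matrix decomposition or their decoder, so the function name `plsa`, its parameters, and the toy counts below are all assumptions and not taken from the paper.

```python
import random


def plsa(counts, n_topics, n_iters=50, seed=0):
    """Fit textbook PLSA with EM (illustrative sketch, not the paper's method).

    counts: list of rows, counts[d][w] = occurrences of word w in document d.
    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # empty document or unused topic: fall back to uniform
            return [1.0 / len(row)] * len(row)
        return [v / total for v in row]

    # Random positive initialization of both distributions.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iters):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                norm = sum(post)
                if norm == 0:
                    continue
                for z in range(n_topics):
                    weight = n_dw * post[z] / norm
                    new_z_d[d][z] += weight
                    new_w_z[z][w] += weight
        # M-step: renormalize the expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Hypothetical toy corpus: 4 documents over a 4-word vocabulary.
toy_counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
doc_topics, topic_words = plsa(toy_counts, n_topics=2)
```

Each row of `doc_topics` is a topic distribution for one document. A topic-based translation system could, in principle, use such a distribution to condition translation choices on the document background, though how the paper does so is not stated in the abstract.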

Keywords